ISOFORMS OF THE HUMAN VITAMIN D RECEPTOR



Field of the Invention:

The present invention relates to isolated polynucleotide molecules
which encode novel isoforms of the human Vitamin D receptor (hVDR) or
variant transcripts for hVDR. The polynucleotide molecules may be utilised
in, for example, methods of screening compounds for VDR agonists and/or
antagonists.

Background of the Invention:

The active hormonal form of vitamin D, 1,25-dihydroxyvitamin D3
(1,25-(OH)2D3), has a central role in calcium and phosphate homeostasis, and
the maintenance of bone. Apart from these calcitropic effects, 1,25-(OH)2D3
has been shown to play a role in controlling cell growth and differentiation in
many target tissues. The effects of 1,25-(OH)2D3 are mediated by a specific
receptor protein, the vitamin D receptor (VDR), a member of the nuclear
receptor superfamily of transcriptional regulators which also includes
steroid, thyroid and retinoid receptors as well as a growing number of orphan
receptors. Upon binding hormone, the VDR regulates gene expression by
direct interaction with specific sequence elements in the promoter regions of
hormone-responsive target genes. This transactivation or repression involves
multiple interactions with other protein cofactors, heterodimerisation
partners and the transcription machinery.

Although a cDNA encoding the human VDR was cloned in 1988 (1),
little has been documented characterising the gene structure and pattern of
transcription since that time. The regulation of VDR abundance is one
potentially important mechanism for modulating 1,25-(OH)2D3
responsiveness in target cells. It is also possible that VDR has a role in
non-transcriptional pathways, perhaps via localization to a non-nuclear
compartment and/or interaction with components of other signalling
pathways. However, the question of how VDRs are targeted to different cell
types and how they are regulated remains unresolved. There have been many
reports in the literature describing translational or transcriptional control of
VDR levels, both homologously and heterologously, mostly in non-human
systems.

A recent study (2) showed that in the kidney, alternative splicing of
human VDR transcripts transcribed from a GC-rich promoter generates
several transcripts which vary only in their 5' UTRs. The present inventors
have now identified further upstream exons of the VDR gene which generate
5' variant transcripts, suggesting that the expression of the VDR gene is
regulated by more than one promoter. A subset of these transcripts is
expressed in a restricted tissue-specific pattern and further variant transcripts
have the potential to encode an N-terminally variant protein. These results
may have implications for understanding the actions of 1,25-(OH)2D3 in
different tissues and cell types, and the possibility that N-terminally variant
VDR proteins may be produced has implications for altered activities such as
transactivation function or subcellular localisation of the receptor protein.
Furthermore, these variants, by their level, tissue specificity, subcellular
localisation and functional activity, may yield targets for pharmaceutical
intervention. The variants may also be useful in screening potential analogs
and/or antagonists of vitamin D compounds.

Disclosure of the Invention:

In a first aspect, the invention provides an isolated polynucleotide
molecule encoding a human Vitamin D receptor (hVDR) isoform, said
polynucleotide molecule comprising a nucleotide sequence which includes
sequence that substantially corresponds or is functionally equivalent to that
of exon 1d of the human VDR gene.

Exon 1d (referred to as exon 1b in the Australian Provisional Patent
Specification No. PO9500) is a 96 bp exon located 296 bp downstream from
exon 1a (2). The sequence of exon 1d is:

5'GTTrCCTTC r rTCTGrCGGCiGCGCCTTGGCATGGAGTGGAGGA/\TAAGAA 
AAGGAGCGATrGG(;TGTCGATGGTGCTCAG^\C r rGCTGGAGTGGAGG3' 
(SEQ ID NO: 1).

The nucleotide sequence of the polynucleotide molecule of the first
aspect of the invention preferably does not include sequence corresponding
to that of exon 1a, exon 1f and/or exon 1e. However, the nucleotide sequence
of the polynucleotide molecule of the first aspect of the invention may or
may not include sequence that substantially corresponds or is functionally
equivalent to that of exon 1b and/or exon 1c.

Preferably, the polynucleotide molecule of the first aspect comprises a
nucleotide sequence which includes:

(i) sequence that substantially corresponds or is functionally
equivalent to that of exons 1d, 1c and 2-9 and encodes a VDR isoform of
approximately 477 amino acids,

(ii) sequence that substantially corresponds or is functionally
equivalent to that of exons 1d and 2-9 and encodes a VDR isoform of
approximately 450 amino acids, or

(iii) sequence that substantially corresponds or is functionally
equivalent to that of exons 1d and 2-9 and further includes a 152 bp intronic
sequence, and encodes a truncated VDR isoform of approximately 72 amino
acids.

Most preferably, the polynucleotide molecule of the first aspect of the
invention comprises a nucleotide sequence substantially corresponding to
that shown as SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4.

In a second aspect, the invention provides an isolated polynucleotide
molecule encoding a human Vitamin D receptor (hVDR), said polynucleotide
molecule comprising a nucleotide sequence which includes sequence that
substantially corresponds to that of exon 1f and/or 1e of the human VDR
gene.

Exon 1f is a 207 bp exon located more than 9 kb upstream from exon 1a
(21 bp upstream from exon 1c(8). The sequence of exon 1f is:

5'TGCGACGTTGGCGGTGAGCCTGGGGACAGCGGTGAGGC
CAGAGACGGACGGACGGAGGGGCCCGGCCCAAGGCGAGGG
AGAAGAGCGGCACTAAGGCAGAAAGGAAGAGGGCGGTGTG
TTCACCCGCAGCCCAATCCATCACTCAGCAACTCGTAGAC
GCTCGTAGAAAGTTCCTCCGAGGAGCCTGCCATCCAGTCGT
GCGTGCAG3' (SEQ ID NO: 5)

Exon 1e is a 157 bp exon located 1826 bp upstream from exon 1a (2).
The sequence of exon 1e is:

5'AGCCAGCATGAAACAGTGGGATGTGCAGAG
AGAAGArCTGC;(;TC:CAGTAGCTCTGACACTCCTCAGClGT 
AGAAACC'ITGACAACTCTGCACATCAGTTGTACAATGGAA 
CGGlA1TTITTACTC r rTCATGTCTGAAY\A(;(;C7rATGATAA 
AGATCAA3' (SEQ ID NO: 6)

The nucleotide sequence of the polynucleotide molecule of the second
aspect of the invention preferably does not include sequence corresponding
to that of exon 1a, 1d or 1b. However, the nucleotide sequence of the
polynucleotide molecule of the second aspect of the invention may or may
not include sequence that substantially corresponds or is functionally
equivalent to that of exon 1c.

Preferably, the polynucleotide molecule of the second aspect comprises a
nucleotide sequence which includes sequence that substantially corresponds
or is functionally equivalent to that of exons 1f and 2-9.

Most preferably, the polynucleotide molecule of the second aspect of the
invention comprises a nucleotide sequence substantially corresponding to
that shown as SEQ ID NO: 7.

The polynucleotide molecule of the first or second aspects may be 
incorporated into plasmids or expression vectors (including viral vectors), 
which may then be introduced into suitable host cells (e.g. bacterial, yeast, 
insect and mammalian host cells). Such host cells may be used to express 
the VDR or functionally equivalent fragment thereof encoded by the isolated 
polynucleotide molecule. 

Accordingly, in a third aspect, the present invention provides a host 
cell transformed with the polynucleotide molecule of the first or second 
aspect. 

In a fourth aspect, the present invention provides a method of
producing a VDR or a functionally equivalent fragment thereof, comprising
culturing the host cell of the third aspect under conditions enabling
the expression of the polynucleotide molecule and, optionally, recovering the
VDR or functionally equivalent fragment thereof.

Preferably, the host cell is of mammalian origin. Preferred examples
include NIH 3T3 and COS 7 cells.

In a preferred embodiment, the VDR or functionally equivalent
fragment thereof is localised to a cell membrane or other subcellular
compartment as distinct from a nuclear localisation.

The polynucleotide molecules of the first aspect of the invention
encode novel VDR isoforms which may be of interest both clinically and
commercially. By using the polynucleotide molecule of the present
invention it is possible to obtain VDR isoform proteins or functionally
equivalent fragments thereof in a substantially pure form.

Accordingly, in a fifth aspect, the present invention provides a human
VDR isoform or functionally equivalent fragment thereof encoded by a
polynucleotide molecule of the first aspect, said VDR isoform or functionally
equivalent fragment thereof being in a substantially pure form.

In a sixth aspect, the present invention provides an antibody or
antibody fragment capable of specifically binding to the VDR isoform of the
fifth aspect.

The antibody may be monoclonal or polyclonal; however, it is
presently preferred that the antibody is a monoclonal antibody. Suitable
antibody fragments include Fab, F(ab')2 and scFv.

In an eighth aspect, the present invention provides a non-human
animal transformed with a polynucleotide molecule according to the first or
second aspect of the invention.

In a seventh aspect, the invention provides a method for detecting
agonist and/or antagonist compounds of a VDR isoform of the fifth aspect,
comprising contacting said VDR isoform, functionally equivalent fragment
thereof or a cell transformed with and expressing the polynucleotide
molecule of the first aspect, with a test compound under conditions enabling
the activation of the VDR isoform or functionally equivalent fragment
thereof, and detecting an increase or decrease in the activity of the VDR
isoform or functionally equivalent fragment thereof.

An increase or decrease in activity of the receptor or functionally
equivalent fragment thereof may be detected by measuring changes in
interactions with known cofactors (e.g. SRC-1, GRIP-1 and TFIIB) or
unknown cofactors (e.g. through use of the yeast dual hybrid system).

In a ninth aspect, the present invention provides an oligonucleotide or
polynucleotide probe comprising a nucleotide sequence of 10 or more
nucleotides, the probe comprising a nucleotide sequence such that the probe
specifically hybridises to the polynucleotide molecule of the first or second
aspect under high stringency conditions (Sambrook et al., Molecular Cloning:
A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press).
Preferably, the probe is labelled.

In a tenth aspect, the present invention provides an antisense
polynucleotide molecule comprising a nucleotide sequence capable of
specifically hybridising to an mRNA molecule which encodes a VDR encoded
by the polynucleotide molecule of the first or second aspect, so as to prevent
translation of the mRNA molecule.

Such antisense polynucleotide molecules may include a ribozyme
region to catalytically inactivate mRNA to which it is hybridised.

The polynucleotide molecule of the first or second aspect of the
invention may be a dominant negative mutant which encodes a gene product
causing an altered phenotype by, for example, reducing or eliminating the
activity of endogenous VDR.

In an eleventh aspect, the invention provides an isolated 
polynucleotide molecule comprising a nucleotide sequence substantially 
corresponding or, at least, showing >75% (preferably >85% or, even more 
preferably, >95%) sequence identity to: 

(i) 5TGCGAGGTTGGCGGTGAGC( TGGGGACAGGGGTGAGGCCAGAGA 
CGGACGGACGCAGG(;(;(;CCGGCCC/\,\GGCGAGGGAG.\/\CAGCGGCACTA 
AGGGAC;AAAG(;AAGAGGGCGGTGTGTTCACCCGCAGCCCA^VrCCATCAG 
TGAGC/^CTCCl^AGACXCTGGTAG/V^XGT'rCXrrcXlC^CXiAGCCTGCCATC 

CAGTCGTGCGTGCAG3' (exon 1f) (SEQ ID NO: 5),

(ii) 57\GGCAGCATGAAACAGTGGGATGTGGAGAGAGAAGATGTGGGTG 
CAGTAGCTCTGACACrCGlX^XC^^CiTAG.AAACCITGACAACTCTGCACAT 
CAGTTGlV\c:AA'r(;c;AAGGGTATTTTTTACTCTrGATGTCTGAAiV\GGCTA 

TGATAAAGATGAA3' (exon 1e) (SEQ ID NO: 6), or

(iii) 5'G nTCCl'FG rt'G'rc^IXX^GGGGGCCTTGGCATGGAGTGGAGGAATA

AGAAAAGGAGGGAITGGCTGTGGATCGTGCTCAGAACTGC^ 

GG3' (exon 1d) (SEQ ID NO: 1).

The polynucleotide molecules of the eleventh aspect may be useful as
probes for the detection of VDR variant transcripts and as such may be useful
in assessing cell or tissue-specific expression of variant transcripts.

The terms "substantially corresponds" and "substantially
corresponding" as used herein in relation to nucleotide sequences are intended
to encompass minor variations in the nucleotide sequence which, due to
degeneracy in the DNA code, do not result in a substantial change in the
encoded protein. Further, these terms are intended to encompass other minor
variations in the sequence which may be required to enhance expression in a
particular system but in which the variations do not result in a decrease in
biological activity of the encoded protein.

The term "functionally equivalent" as used herein in relation to
nucleotide sequences encoding a VDR isoform is intended to encompass
nucleotide sequence variants of up to 5% sequence divergence (i.e. retaining
95% or more sequence identity) which encode VDR isoforms of substantially
equivalent biological activity(ies) as said VDR isoform.
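
The 95% identity figure in this definition can be illustrated with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with '-' marking gaps; the function names are hypothetical, the example sequences are toy strings rather than VDR sequence, and the check covers only the sequence-identity part of the definition, not the requirement for equivalent biological activity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned to the same length; '-' marks a gap.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    # A column counts as a match only if both residues are identical bases.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if identity is at or above the threshold (95% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 20-mers (illustrative only) differing at one position.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACGTACGTACGA"
print(percent_identity(reference, variant))          # 95.0
print(meets_identity_threshold(reference, variant))  # True
```

The same function could be called with a threshold of 75.0 or 85.0 to mirror the identity levels recited for the eleventh aspect.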

The term "functionally equivalent fragment" as used herein in respect
of a VDR isoform is intended to encompass functional peptide and
polypeptide fragments of said VDR isoform which include the domain or
domains which bestow the biological activity characteristic of said VDR
isoform.

The terms "comprise", "comprises" and "comprising" as used
throughout the specification are intended to refer to the inclusion of a stated
step, component or feature or group of steps, components or features with or
without the inclusion of a further step, component or feature or group of
steps, components or features.

The invention will hereinafter be further described by way of the
following non-limiting example and accompanying figures.

Brief description of the figures:

FIG. 1. (A) Human VDR gene locus. Four overlapping cosmid clones were
isolated from a human lymphocyte genomic library (Stratagene) and directly
sequenced. Clone J5 extends from the 5' flanking region to intron 2; AE, from
intron 1b to intron 5; D2, from intron 3 to the 3' UTR; WE, from intron 6
through the 3' flanking region. Sequence upstream of exon 1f was obtained by
anchored PCR from genomic DNA. (B) Structure of hVDR transcripts.
Transcripts 1-5 originate from exon 1a. Transcript 1 corresponds to the
published cDNA (1). Transcripts 6-10 originate from exon 1d and transcripts
11-14 originate from exon 1f. Boxed numbers indicate the major transcript
(based on the relative intensities of the multiple PCR products) within each
exon-specific group of transcripts generated with a single primer set. While
all transcripts have a translation initiation codon in exon 2, exon 1d
transcripts have the potential to initiate translation upstream in exon 1d,
with transcripts 6 and 9 encoding VDR proteins with extended N termini. (C)
N-terminal variant proteins encoded by novel hVDR transcripts. Transcript 1
corresponds to the published cDNA sequence (1) and encodes the 427-aa
hVDR protein. Transcripts 6 and 9 code for a protein with an extra 50 aa or
23 aa, respectively, at the N-terminal. The 23 aa of the hVDR A/B domain are
shown in bold.

FIG. 2. RT-PCR analysis of expression of variant hVDR transcripts. (A) Exon
1a transcripts (220 bp, 301 bp, 342 bp, 372 bp, and 423 bp). (B)
Exon 1d transcripts (224 bp, 305 bp, 346 bp, 376 bp, and 427 bp). (C) Exon 1f
transcripts (228 bp, 309 bp, 387 bp, and 468 bp). RT-PCR was carried
out with exon 1a-, 1d-, or 1f-specific forward primers and a common reverse
primer in exon 3. The sizes of the PCR products and the pattern of
bands are similar in A and B by virtue of the identical splicing pattern of
exon 1a and 1d transcripts and the fact that primers were designed to
generate PCR products of comparable sizes. All tissues and cell lines are
human in origin.
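
The band sizes listed for panels A and C follow from adding the optional segments to the smallest product of each group. The sketch below is illustrative only: it assumes the smallest band in each group (220 bp and 228 bp) is the product with no optional segment, takes the 122 bp exon 1b, 159 bp exon 1e and 152 bp retained intron 2 segment from the Results, and uses an exon 1c length of 81 bp that is inferred from the cDNA coordinates in the figure descriptions rather than stated explicitly. The names are hypothetical.

```python
# Segment lengths in bp. The 1c value is an inferred assumption (see lead-in).
SEGMENT_BP = {"1b": 122, "1c": 81, "1e": 159, "intron2": 152}


def product_size(base_bp: int, segments: tuple) -> int:
    """RT-PCR product size: smallest product plus any optional segments."""
    return base_bp + sum(SEGMENT_BP[name] for name in segments)


def predicted_sizes(base_bp: int, variants: list) -> list:
    """Sorted product sizes for every splice variant in a primer group."""
    return sorted(product_size(base_bp, v) for v in variants)


# Optional segments present in each variant amplified from exon 1a.
EXON_1A_VARIANTS = [(), ("1c",), ("1b",), ("intron2",), ("1b", "1c")]
# Optional segments present in each variant amplified from exon 1f.
EXON_1F_VARIANTS = [(), ("1c",), ("1e",), ("1e", "1c")]

print(predicted_sizes(220, EXON_1A_VARIANTS))  # [220, 301, 342, 372, 423]
print(predicted_sizes(228, EXON_1F_VARIANTS))  # [228, 309, 387, 468]
```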

FIG. 3. Functional analysis of sequence flanking exons 1a and 1d (A) and
exon 1f (B) in NIH 3T3 (solid bars) and COS 7 cells (open bars).
The parent vector pGL3basic was used as a promoterless control, and a
promoter-chloramphenicol acetyltransferase (CAT) gene reporter construct
was cotransfected as an internal control for transfection efficiency in each
case. The activity of each construct was corrected for transfection
efficiency and for the activity of the pGL3basic empty vector control and
expressed as a percentage of the activity of the construct 1a(-488, +75)
± SEM of at least three separate transfections. Exon 1a and 1d flanking
constructs are defined in relation to the transcription start site of exon
1a, designated +1, which lies 54 nt upstream of the published cDNA (1). Exon
1f flanking constructs are defined relative to the exon 1f transcription
start site, designated +1. Transcription start sites were determined by the 5'
termini of the longest RACE clones. The open box corresponds to
the GC-rich region.

FIG. 4. Provides the nucleotide sequence of novel exons detected by 5' RACE:
(A) exon 1b (SEQ ID NO: 8), (B) exon 1f (SEQ ID NO: 5) [P1f is indicated by
an arrow above the sequence], (C) exon 1e (SEQ ID NO: 6), (D) exon 1d (SEQ
ID NO: 1) [in-frame ATG codons are highlighted and P1d is indicated by an
arrow above the sequence]. Intronic sequences are shown in lower case.
Canonical splice site consensus sequences are indicated in bold. The
transcription start sites for exons 1f and 1d were determined by the 5' termini
of RACE clones. No intron sequence is shown 3' to exon 1f as cosmid clone
J5 terminated in the intron between exons 1f and 1e.

FIG. 5. Provides the nucleotide sequence corresponding to transcript 6 (see
figure 1) (SEQ ID NO: 2), together with the predicted amino acid sequence
(SEQ ID NO: 9) of the encoded protein. Nucleotides 1-96 correspond to exon
1d; nucleotides 97-1463 correspond to exon 1c to the stop codon in exon 9
(or nucleotides -83 to 1283 of the hVDR cDNA (1)).

FIG. 6. Provides the nucleotide sequence corresponding to transcript 9 (see
figure 1) (SEQ ID NO: 3), together with the predicted amino acid sequence
(SEQ ID NO: 10) of the encoded protein. Nucleotides 1-96 correspond to
exon 1d; nucleotides 97-1382 correspond to exon 2 to the stop codon in
exon 9 (or nucleotides -2 to 1283 of the hVDR cDNA (1)).

FIG. 7. Provides the nucleotide sequence corresponding to transcript 10 (see
figure 1) (SEQ ID NO: 4), together with the predicted amino acid sequence
(SEQ ID NO: 11) of the encoded protein. Nucleotides 1-96 correspond to
exon 1d; nucleotides 97-244 correspond to exon 2; nucleotides 245-396
correspond to intronic sequence immediately 3' to exon 2; nucleotides
397-1534 correspond to exon 3 to the stop codon in exon 9 (or nucleotides
146 to 1283 of the hVDR cDNA (1)).

FIG. 8. Provides the nucleotide sequence corresponding to transcript 11 (see
figure 1) (SEQ ID NO: 7), together with the predicted amino acid sequence
(SEQ ID NO: 12) of the encoded protein. Nucleotides 1-207 correspond to
exon 1f; nucleotides 208-1574 correspond to exon 1c to the stop codon in
exon 9 (or nucleotides -83 to 1283 of the hVDR cDNA (1)).

Example:

EXPERIMENTAL PROCEDURES 

Isolation and Characterisation of Genomic Clones 

A human lymphocyte cosmid library (Stratagene, La Jolla, CA) was
screened using a 2.1 kb fragment of the hVDR cDNA encompassing the entire
coding region but lacking the 3' UTR, a 241 bp PCR product spanning exons 1
to 3 of the human VDR cDNA, and a 303 bp PCR product spanning exons 3
and 4 of the hVDR cDNA, following standard colony hybridisation
techniques. DNA probes were labelled by nick translation (Life Technologies,
Gaithersburg, MD) with [α-32P]dCTP. Positively hybridising colonies were
picked and secondary and tertiary screens carried out until complete
purification. Cosmid DNA from positive clones was purified (Qiagen),
digested with different restriction enzymes and characterised by Southern
blot analysis using specific [γ-32P]ATP labelled oligonucleotides as probes.
Cosmid clones were directly sequenced using dye-termination chemistry and
automated fluorescent sequencing on an ABI Prism 377 DNA Sequencer
(Perkin-Elmer, Foster City, CA). Sequence upstream of the most 5' cosmid was
obtained by anchored PCR from genomic DNA using commercially available
anchor-ligated DNA (Clontech, Palo Alto, CA).

Rapid Amplification of cDNA 5-prime Ends (5'-RACE)

Alternative 5' variants of the human VDR gene were identified by
5' RACE using commercially prepared anchor-ligated cDNA (Clontech)
following the instructions of the manufacturer. Two rounds of PCR using
nested reverse primers in exons 3 and 2 (P1: 5'-ccgcttcatgcttcgcctgaagaagcc-3';
P2: 5'-tgcagaattcacaggtcatagcattgaag-3') were carried out on a Corbett
FTS-4000 Capillary Thermal Sequencer (Corbett Research, NSW, Australia). After
26 cycles of PCR, 2% of the primary reaction was reamplified for 31 cycles.
The PCR products were cloned into pUC18 and sequenced by the dideoxy
chain termination method.

Cell Culture

The embryonal kidney cell line HEK-293, an embryonic intestine cell
line, Intestine-407, and WS1, a foetal skin fibroblast cell line, were all
cultured in Eagle's MEM with Earle's BSS and supplemented with either 10%
heat-inactivated FBS, 15% FBS or 10% FBS with non-essential amino acids,
respectively. The osteosarcoma cell lines MG-63 and Saos-2 were cultured in
Eagle's MEM with nonessential amino acids and 10% heat-inactivated FBS
and McCoy's 5a medium with 15% FBS, respectively. The breast carcinoma
cell line T47D and the colon carcinoma cell lines LIM 1863 and COLO 206F
were cultured in RPMI medium supplemented with 0.2 IU bovine insulin/ml
and 10% FBS, 5% FBS or 10% FBS, respectively. LIM 1863 were a gift from
R. H. Whitehead (3). HK-2 kidney proximal tubule cells were grown in
keratinocyte-serum free medium supplemented with 5 ng/ml recombinant
EGF and 40 µg/ml bovine pituitary extract. BC1 foetal osteoblast-like cells
were kindly donated by R. Mason (4) and were grown in Eagle's MEM with 5% FBS
and 5 mg/L vitamin C. Unless otherwise stated all cell lines were obtained
from the American Type Culture Collection (Manassas, VA).

Reverse Transcriptase-PCR (RT-PCR).

Total RNA extracted from approximately 1.5 x in" cells, from 
leukocytes prepared from 40 ml blood, or from human tissue using
acid-phenol extraction was purified by using a guanidinium isothiocyanate-cesium
chloride step gradient. First-strand cDNA was synthesized from 5 µg of total
RNA primed with random hexamers (Promega) using Superscript II reverse
transcriptase (Life Technologies). One-tenth of the cDNA (2 µl) was used for
subsequent PCR, with 36 cycles of amplification, using exon-specific forward
primers (exon 1a: corresponding to nucleotides 1-21 of hVDR cDNA (1);
exon 1d: 5'-GGCTGTCGATGGTGCTCAGAAC-3';
exon 1f: 5'-AAGTTCCTCCGAGGAGCCTGCC-3')
and a common reverse primer in exon 3 [corresponding to nucleotides
301-280 of hVDR cDNA (1)]. All RT-PCRs were repeated multiple times by using
RNA/cDNA prepared at different times from multiple sources. Each PCR
included an appropriate cDNA-negative control, and additional controls
included RT-negative controls prepared alongside cDNA and RNA/cDNA
prepared from VDR-negative cell lines. PCR products were separated on 2%
agarose and visualized with ethidium bromide staining.

Functional Analysis of hVDR Gene Promoters.

Sequences flanking exons 1a, 1d, and 1f (see Fig. 1A) were
PCR-amplified by using Pfu polymerase (Stratagene) and cloned into the
pGL3basic vector (Promega) upstream of the luciferase gene reporter.
Promoter-reporter constructs were transfected into NIH 3T3 and COS 7 cells
by using the standard calcium phosphate-precipitation method. Cells were
seeded at 2.312.5 x 10* per 150-cm2 flask the day before transfection. Several
hours before the precipitates were added the medium was changed to DMEM
with 2% charcoal-stripped FBS. Cells were exposed to precipitate for 16 h
before subculturing and were harvested 24 h later. The parent vector
pGL3basic was used as a promoterless control in these experiments and a
simian virus 40 promoter-chloramphenicol acetyltransferase (CAT) gene
reporter construct was cotransfected as an internal control for transfection
efficiency in each case. The activity of each construct was corrected for
transfection efficiency and for the activity of the pGL3basic empty vector
control and expressed as a percentage of the activity of the construct
1a(-488, +75). Luciferase and CAT assays were carried out in triplicate, and
each construct was tested in transfection at least three times.
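
The normalisation described above can be sketched as follows. This is an illustrative reading only: the text does not give the exact formula, so the sketch assumes that transfection efficiency is corrected by dividing luciferase by CAT activity and that the empty-vector correction is a subtraction of the pGL3basic background. The function names and all readings are hypothetical.

```python
from statistics import mean


def normalised_activity(readings):
    """Mean luciferase/CAT ratio over replicate (luciferase, CAT) readings."""
    return mean(luc / cat for luc, cat in readings)


def percent_of_reference(construct, empty_vector, reference):
    """Background-corrected activity as a percentage of the reference construct.

    Assumption: the empty-vector (pGL3basic) activity is subtracted from both
    the test construct and the reference construct before taking the ratio.
    """
    background = normalised_activity(empty_vector)
    corrected = normalised_activity(construct) - background
    corrected_reference = normalised_activity(reference) - background
    if corrected_reference <= 0:
        raise ValueError("reference construct must exceed the empty vector")
    return 100.0 * corrected / corrected_reference


# Hypothetical triplicate (luciferase, CAT) readings in arbitrary units.
empty = [(10.0, 1.0), (20.0, 2.0), (5.0, 0.5)]
reference = [(310.0, 1.0), (620.0, 2.0), (155.0, 0.5)]
test_construct = [(160.0, 1.0), (320.0, 2.0), (80.0, 0.5)]

print(percent_of_reference(test_construct, empty, reference))  # 50.0
```

If the background correction were instead applied as a fold-over-empty-vector ratio, only `percent_of_reference` would need to change.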

RESULTS 

Identification of Alternative 5' Variants of the hVDR Gene. 

Upstream exons were identified in human kidney VDR transcripts by 5'
RACE (exons 1f, 1e, 1d, and 1b) and localized by sequencing of cosmid
clones (Fig. 1A). To verify these results and to characterize the structure of
the 5' end of the VDR gene, exon-specific forward primers were used with a
common reverse primer in exon 3 to amplify specific VDR transcripts from
human tissue and cell line RNA (Fig. 1B). The identity of these PCR products
was verified by Southern blot and by cloning and sequencing. Five different
VDR transcripts originating from exon 1a were identified. The major
transcript (transcript 1 in Fig. 1B) corresponds to the published cDNA
sequence (1). Three less-abundant forms (2, 3, and 4 in Fig. 1B) arise from
alternative splicing of exon 1c and a novel 122-bp exon 1b into or out of the
final transcript. These three variant transcripts were described recently by
Pike and colleagues (2). A fifth minor variant was identified (5 in Fig. 1B)
that lacks exons 1b and 1c, but includes an extra 152 bp of intronic sequence
immediately 3' to exon 2, potentially encoding a truncated protein as a result
of an in-frame termination codon in intron 2.

Four more transcripts were characterized that originate from exon 1f, a
novel 207-bp exon more than 9 kb upstream from exon 1a. The major
1f-containing transcript (11 in Fig. 1B) consists of exon 1f spliced immediately
adjacent to exon 1c. Three less-abundant variants (12, 13, and 14 in Fig. 1B)
arise from alternative splicing of exon 1c and a novel 159-bp exon 1e into or
out of the final transcript. All these hVDR variants differ only in their 5'
UTRs and encode identical proteins from translation initiation in exon 2.

Of considerable interest, another five hVDR transcripts were identified
that originate from exon 1d, a novel 96-bp exon located 296 bp downstream
from exon 1a. The major exon 1d-containing transcript (6 in Fig. 1B) utilizes
exon 1d in place of exon 1a of the hVDR cDNA. Three minor variants (7, 8,
and 9 in Fig. 1B) arise from alternative splicing of exons 1b and 1c into or out
of the transcript, analogous to the exon 1a-containing variants 2, 3, and 4. A
fifth minor variant transcript (10 in Fig. 1B) lacks exons 1b and 1c, but
includes 152 bp of intron 2 analogous to the exon 1a-containing transcripts,
and also potentially encodes a truncated protein. Two of these exon
1d-containing hVDR transcripts encode an N-terminal variant form of the hVDR
protein. Utilization of an ATG codon in exon 1d, which is in a favorable
context and in-frame with the major translation start site in exon 2, would
generate a protein with an additional 50 aa N-terminal to the ATG codon in
exon 2 in the case of variant 6 or 23 aa in the case of variant 9 (Fig. 1C).
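
The two N-terminal extensions are mutually consistent with the reading frame, as the following arithmetic sketch shows. It assumes that variant 6 differs from variant 9 only by the inclusion of exon 1c, and that exon 1c is 81 bp long; that length is inferred from the cDNA coordinates given in the figure descriptions and is not stated explicitly. The names are hypothetical.

```python
CODON_NT = 3
HVDR_LENGTH_AA = 427   # published hVDR protein encoded by transcript 1
EXON_1C_NT = 81        # assumption: inferred length, not stated in the text


def isoform_length(extra_aa: int) -> int:
    """Length of an N-terminally extended isoform in amino acids."""
    return HVDR_LENGTH_AA + extra_aa


def keeps_reading_frame(inserted_nt: int) -> bool:
    """An inserted segment preserves the frame if it is a whole number of codons."""
    return inserted_nt % CODON_NT == 0


extension_aa = {"variant 6": 50, "variant 9": 23}
for name, extra in extension_aa.items():
    print(name, isoform_length(extra), "aa")   # 477 aa and 450 aa

# The 27 extra residues of variant 6 correspond to 81 nt, the inferred
# length of exon 1c, so including exon 1c keeps the exon 1d ATG in frame.
extra_nt = (extension_aa["variant 6"] - extension_aa["variant 9"]) * CODON_NT
print(extra_nt, extra_nt == EXON_1C_NT, keeps_reading_frame(EXON_1C_NT))
```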

The relative level of expression of the different transcripts is difficult
to address with PCR since relatively minor transcripts may be amplified.
However, Southern blots of PCR products from the linear range of PCR
amplification indicated that equivalent amounts of PCR product were
accumulated after 26 cycles for exon 1a transcripts compared with 30 cycles
for exon 1d transcripts, suggesting that 1d abundance is about 5% of that of
1a transcripts. This is consistent with the frequency of clones selected and
sequenced from RACE analysis of two separate samples of kidney RNA: 1a
(21/27; 78%), 1d (2/27; 7%), and 1f (4/27; 15%). RT-PCR with exon 1a-, 1d-, or
1f-specific forward primers and reverse primers in exons 7 or 9, followed by
cloning and sequencing, suggests that these 5' variant transcripts are not
associated with differences at the 3' end of the transcript.
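
The abundance estimate and the clone frequencies can be reproduced with a short calculation. This is a minimal sketch under an idealised assumption that the product doubles every cycle (efficiency of 1.0); real amplification efficiencies are lower, so the result is only an order-of-magnitude check on the stated figure of about 5%. The function names are hypothetical.

```python
def relative_abundance(cycles_reference: int, cycles_target: int,
                       efficiency: float = 1.0) -> float:
    """Target abundance relative to the reference template.

    Assumes both templates yield equal product after the stated number of
    cycles and amplify by a factor of (1 + efficiency) per cycle.
    """
    amplification_per_cycle = 1.0 + efficiency
    return amplification_per_cycle ** (cycles_reference - cycles_target)


def clone_percentages(counts: dict) -> dict:
    """Share of sequenced clones per first exon, rounded to whole percent."""
    total = sum(counts.values())
    return {exon: round(100 * n / total) for exon, n in counts.items()}


# Exon 1a products at 26 cycles versus exon 1d products at 30 cycles.
print(relative_abundance(26, 30))   # 0.0625, i.e. roughly 6%

# RACE clone counts reported for kidney RNA.
print(clone_percentages({"1a": 21, "1d": 2, "1f": 4}))
# {'1a': 78, '1d': 7, '1f': 15}
```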

Exon-Intron Organization of the hVDR Gene.

Overlapping cosmid clones were isolated from a human lymphocyte
genomic library and characterized by hybridization to exon-specific
oligonucleotide probes (Fig. 1A). The exon-intron boundaries of the hVDR
gene were determined by comparison of the genomic sequence from cosmid
clones with the cDNA sequence. Upstream exons were localized in the VDR
gene by sequencing cosmid clones, which extend approximately 7 kb into the
intron between exons 1e and 1f, enabling verification of both their sequence
and the presence of consensus splice donor/acceptor sites. Sequence
upstream of exon 1f was obtained by anchored PCR from genomic DNA by
using commercially available anchor-ligated DNA (CLONTECH). In total, the
hVDR gene spans more than 60 kb and consists of at least 14 exons (Fig. 1A).

Tissue-Specific Expression of hVDR Transcripts.

The pattern of expression of variant hVDR transcripts was examined by
RT-PCR in a variety of cell lines and tissues with exon 1a-, 1d-, or 1f-specific
forward primers and a common reverse primer in exon 3. Exon 1a and 1d
transcripts (Fig. 1B, variants 1-10) were coordinately expressed in all RNA
samples analyzed (Fig. 2 A and B). Exon 1f transcripts (Fig. 1B, variants
11-14), however, were detected only in RNA from human kidney tissue (two
separate samples), human parathyroid adenoma tissue, and an intestinal
carcinoma cell line, LIM 1863 (Fig. 2C). Interestingly, these represent major
target tissues for the calcitropic effects of vitamin D.

Functional Analysis of hVDR Gene Promoters.

Promoter activities of the 5' flanking regions of exons 1a, 1d, and 1f
were examined in NIH 3T3 and COS 7 cells (Fig. 3). Sequences flanking exon
1a exhibited high promoter activity in both cell lines (Fig. 3A). Maximum
luciferase expression of 30- and 54-fold over the empty vector was attained
for construct 1a(-488, +75) in NIH 3T3 and COS 7 cells, respectively. This
activity could be attributed largely to a GC-rich region containing multiple
consensus Sp1-binding motifs lying within 100 bp immediately adjacent to
the transcription start site. This region alone, upstream of a luciferase
reporter [construct 1a(-94, +75)], accounted for 43% of the maximum activity
observed in NIH 3T3 cells and 86% of the maximum observed in COS 7 cells.
The removal of this GC-rich region [construct 1a(-29, +75)] reduced luciferase
activity to only 13% of the maximum in NIH 3T3 and 19% in COS 7 cells.
Despite the fact that VDR transcripts that originated from exon 1d were
identified, distinct promoter activity was not associated with sequences
within 300 bp of exon 1d [constructs 1d(+87, +424) and 1d(+244, +424)];
rather, the sequence immediately adjacent to exon 1d may contain a
suppressor element (Fig. 3A). Construct 1a-1d(-84(>, +470), spanning the 5'
flanking regions of both exons 1a and 1d, resulted in only 42% and 60% of
the activity of 1a(-898, +75) in NIH 3T3 and COS 7 cells, whereas the 3'
deletion of 227 bp restored luciferase activity to 65% and 97% of the activity
of 1a(-898, +75), respectively. Similarly, the 5' truncated construct
1a-1d(-94, +470), spanning the 5' flanking regions of both 1a and 1d, resulted in only
35% and 40% of the activity of 1a(-94, +75), while a further 3' deletion of 227
bp restored luciferase activity to 69% and 91% of the activity of 1a(-94, +75)
in NIH 3T3 and COS 7 cells. It is possible that transcription from exons 1a
and 1d is driven by overlapping promoter regions rather than from two
distinct promoters, as has been described for the mouse androgen receptor
gene.
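
The percentages quoted for the exon 1a constructs are expressed relative to the maximum fold induction over the empty vector in each cell line. As a worked illustration, the sketch below converts those reported percentages back into implied fold inductions. The derived fold values are simple arithmetic on the reported figures and are not measurements stated in this specification; the function and variable names are illustrative.

```python
# Maximum fold induction over empty vector for construct 1a(-488, +75),
# as reported in the text for each cell line.
MAX_FOLD = {"NIH 3T3": 30.0, "COS 7": 54.0}

# Reported activity of truncated constructs, as a percentage of that maximum.
REPORTED_PERCENT = {
    "1a(-94, +75)": {"NIH 3T3": 43.0, "COS 7": 86.0},
    "1a(-29, +75)": {"NIH 3T3": 13.0, "COS 7": 19.0},
}


def percent_of_max(fold: float, max_fold: float) -> float:
    """Express a fold induction as a percentage of the maximum fold induction."""
    return 100.0 * fold / max_fold


def implied_fold(percent: float, max_fold: float) -> float:
    """Convert a percentage of maximum back into a fold induction (derived value)."""
    return percent / 100.0 * max_fold


for construct, by_cell in REPORTED_PERCENT.items():
    for cell_line, percent in by_cell.items():
        fold = implied_fold(percent, MAX_FOLD[cell_line])
        print(f"{construct} in {cell_line}: {percent:.0f}% of maximum, "
              f"about {fold:.1f}-fold over empty vector (derived)")
```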

Sequence upstream of exon 1f showed significant promoter activity in
NIH 3T3 cells of 22% of that of the most active construct, 1a(-488, +75), or
9-fold over pGL3basic [construct 1f(-1168, +58)] (Fig. 3B). A shorter construct
[1f(-172, +58)] had similar activity, with evidence of a suppressor element
(between nucleotides -278 and +172) able to repress luciferase activity by
70%. Interestingly, the same constructs were not active in COS 7 cells. This
cell line-specific activity of exon 1f flanking sequences may reflect a
requirement for tissue- or cell-specific protein factors.


Identification of VDR isoforms in whole cell lysates

The existence of a VDR isoform including exons 1d and 1c has been
confirmed in cell lysates from multiple human, monkey, rat and mouse cell
lines derived from kidney, intestine, liver and bone, by immunoprecipitation
(using the anti-VDR 9A7 rat monoclonal antibody; Affinity Bioreagents Inc.,
Golden, Colorado) followed by Western blot analysis. The 1d- and 1c-exon-specific
antibodies detected the same band in all immunoprecipitations.

DISCUSSION 

The present inventors have identified 5 variant transcripts of the
hVDR that suggest the existence of alternative promoters. These transcripts
may not have been discriminated in previous Northern analyses because of
their similarity in size. Transcription initiation from exons 1a or 1f and
alternative splicing generate VDR transcripts that vary in their 5' UTRs but
encode the same 427-aa protein. Transcription initiation from exon 1d and
alternative splicing generate hVDR transcripts with the potential to encode
variant proteins with an additional 50 or 23 aa at the N terminus. There was
no evidence that these 5' variants are associated with differences at the 3' end
of the transcript. Although isoforms are common in other members of the
nuclear receptor superfamily, the only evidence for isoforms of the hVDR is a
common polymorphism in the triplet encoding the initiating methionine of
the 427-aa form of the VDR that results in initiation of translation at an
alternative start codon beginning at the 10th nucleotide downstream,
encoding a protein truncated by 3 aa at the N terminus (5). Similarly, two
forms of the avian VDR, differing in size by 14 aa, are generated from a single
transcript by alternative translation initiation (6), and in the rat a dominant-negative
VDR is generated by intron retention (7).

Heterogeneity in the 5' region is a common feature of other nuclear
receptor genes. Tissue-specific alternative-promoter usage generates multiple
transcripts of the human estrogen receptor α (ERα), the human and rat
mineralocorticoid receptors, and the mouse glucocorticoid receptor (GR),
which differ in their 5' UTRs but code for identical proteins. However, other
members of the nuclear receptor superfamily have multiple, functionally
distinct isoforms arising from differential promoter usage and/or alternative
splicing. The generation of N-terminal variant protein isoforms has been
described for the progesterone receptor (PR), peroxisome proliferator-activated
receptor (PPAR), and the retinoid and thyroid receptors. Some
receptor isoforms exhibit differential promoter-specific transactivation
activity. The N-terminal A/B regions of many nuclear receptor proteins
possess a ligand-independent transactivation function (AF1). An AF1
domain has been demonstrated for the thyroid receptor β1 (TRβ1), ER, GR,
PR, PPARγ, and the retinoid receptors. The activity of the AF1 domain has
been shown to vary in both a tissue- and promoter-specific manner. The
N-terminal A/B region of nuclear receptors is the least-conserved domain across
the family and between receptor subtypes, varying considerably both in
length and sequence. The VDR, however, is unusual as its N-terminal A/B
region is much shorter than that of other nuclear receptors, with only 23 aa
N-terminal to the DNA-binding domain, and deletion of these residues seems
to have no effect on VDR function. This region in other receptors is
associated with optimal ligand-dependent transactivation and can interact
directly with components of the basal transcription complex. Two stretches
of basic amino acid residues, RNKKR and RPHRR, in the predicted amino
acid sequences of the variant hVDR N termini (Fig. 1C) resemble nuclear
localization signals. An N-terminal variant VDR protein therefore might
exhibit different transactivation potential, possibly mediated by different
protein interactions, or may specify a different subcellular localization. The
tissue-specific expression of exon 1f-containing transcripts is mediated by a
distal promoter more than 9 kb upstream of exons 1a and 1d. Exon 1f
transcripts were detected only in kidney tissue, parathyroid adenoma tissue,
and an intestinal cell line, LIM 1863. It is interesting that these tissues
represent major target tissues for the calcitropic effects of vitamin D. The
absence of 1f-containing transcripts in two other kidney cell lines, HK-2
(proximal tubule) and HEK-293 (embryonal kidney), as well as one other
embryonal intestinal cell line, Intestine-407, suggests that the expression of
1f transcripts is cell type-specific. The cell line-specific activity of exon 1f
flanking sequences in promoter reporter assays may reflect a requirement for
tissue- or cell-specific protein factors to mediate expression from this
promoter.
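
The predicted N-terminal extension and its two basic motifs can be checked against the first 200 nucleotides of the transcript 6 sequence shown in Figure 5. The following is a minimal sketch that assumes the standard genetic code and translation from the first ATG in the sequence; the helper name and constants are illustrative and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Nucleotides 1-200 of transcript 6, transcribed from Figure 5.
TRANSCRIPT_6_START = (
    "GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGGAGGAATAAGAAA"
    "AGGAGCGATTGGCTGTCGATGGTGCTCAGAACTGCTGGAGTGGAGGAAGC"
    "CTTTGGGTCTGAAGTGTCTGTGAGACCTCACAGAAGAGCACCCCTGGGCT"
    "CCACTTACCTGCCCCCTGCTCCTTCAGGGATGGAGGCAATGGCGGCCAGC"
)


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the last full codon."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


protein = translate_from_first_atg(TRANSCRIPT_6_START)
# Residues preceding the MEAMAAS start of the 427-aa form.
extension_length = protein.index("MEAMAAS")
print(protein)
print("N-terminal extension length:", extension_length)
print("RNKKR present:", "RNKKR" in protein, "| RPHRR present:", "RPHRR" in protein)
```

Under these assumptions the reading frame runs without a stop codon from the upstream ATG into the start of the 427-aa form, which is consistent with the N-terminally extended isoform described in the text.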

This study has demonstrated that expression of the human VDR gene,
which spans more than 60 kb and consists of 14 exons, is under complex
transcriptional control by multiple promoters. The expression of multiple
exon 1f transcripts is mediated by utilization of a distal tissue-specific
promoter. Transcription from a proximal promoter or promoters generates
multiple variant hVDR transcripts, two of which code for N-terminal variant
proteins. Multiple, functionally distinct isoforms mediate the tissue- and/or
developmental-specific effects of many members of the nuclear receptor
superfamily. Although the actual relative abundance of the various
transcripts and their levels of translation in vivo have not yet been
characterized, the results suggest that major variant isoforms of the hVDR
exist. Differential regulation of these hVDR gene promoters and of
alternative splicing of variant VDR transcripts may have implications for
understanding the various actions of 1,25-(OH)2D3 in different cell types, and
variant VDR transcripts may play a role in tissue-specific VDR actions in
bone and calcium homeostasis.



It will be appreciated by persons skilled in the art that numerous
variations and/or modifications may be made to the invention as shown in
the specific embodiments without departing from the spirit or scope of the
invention as broadly described. The present embodiments are, therefore, to
be considered in all respects as illustrative and not restrictive.


[Nucleotide sequence listing rows illegible in the source scan.]



SEQ ID NO: 5
<211> 207
<212> DNA
<213> Homo sapiens

<400> 5
tgcgaccttg gcggtgagcc tggggacagg ggtgaggcca gagacggacg gacgcagggg 60
cccggcccaa ggcgagggag aacagcggca ctaaggcaga aaggaagagg gcggtgtgtt 120
cacccgcagc ccaatccatc actcagcaac tcctagacgc tggtagaaag ttcctccgag 180
gagcctgcca tccagtcgtg cgtgcag 207



SEQ ID NO: 6
<211> 157
<212> DNA
<213> Homo sapiens

<400> 6
aggcagcatg aaacagtggg atgtgcagag agaagatctg ggtccagtag ctctgacact 60
cctcagctgt agaaaccttg acaactctgc acatcagttg tacaatggaa cggtattttt 120
tactcttcat gtctgaaaag gctatgataa agatcaa 157



r.EQ ID MC: 
• 1574 
■ :. 1 ;"■ • PMA 
i 3 • iior.c 

4 00 - 7 

' -J M.tC! t q 

- •• - -gqcc c ^a 

gagectg : :a 
acagaagagc 
tg j eg g e c a g 
at jgggt ?tg 

u,:.K:,iqq - ft 
gggaetge eg 
atgtgga rat 
aggaqatgat 
t gtctgagga 
.icccea c: :t a 
ggagecat :c 

- :ct.C' :t get c 
a tetgga t et 
r cte rat 5c t 
ttqetdd ^at 
agt caagtgc 

tqt.e :tq jar 
;aca zag :ct 
r.rt rqearga 
ctggagt jca 
a g a e a t a r a t 
tceagaa j c : t 



2 a pa ens 



g' -qq*: g vjee 
ggegagggag 
ccaat: : a t 
tccagt jgt a 
aeccet jggc 
cart re : "te 
tggaga eg a 
et . cag 3 rga 
eat caeca a g 
eggeat 9 at g 
cc: gaagegg 
geag zagege 
rtccgact t c 
tteeag gece 
agat eaetg t 
gagtgaagaa 
gccecacctg 
qa caeca gga 
eatt g a gate 
r- g t g g e a a c 
ggagetgaf: 
ggaggagcat 
gqaegccgcj 
ccget ge rage 
a (if cjaeet a 



tggggaeagg 
a a 1 /aqeggca 
act eagcaac 
r:q' qragaag 
r. ccactta cc 
rrt gaccet g 
gecactggrt. 
ageatgaagc 
qaca acegac 
aagg agt tea 
aaggag gagg 
a tea t tg cca 
tg rcagntce 
aactecaqa e 
atcaeet rtt 
qat teagatg 
gctga eetgg 
1 1 cagagacc 
at jijt.gt t.tje 
eaaga zt aea 
gageeeer r,i 
gt cctgetea 
el gat tgagg 
eaccegcce.: 
g c a a c c t e a 



ggt jagg cca 
ct aagg caga 
t c ta gauge 
ect t tgggtc 
tgcccco' gc 
gagactttga 
1 1 i- ict t eaa 
ggaagg eaet 
geeaetgeea 
1 1 et qacaqti 
aggecttgaa 

t a et qetqqa 
ggc etceagt 
a raef e eeag 
caga eatgat 
acectteegt 
tcag t ta eaq 
t eaeet erga 
get reaatga 
agt a eegegt 
tcaacttcea 
tgeceatct g 
cca t eeagaa 
egg(;ea(ie^a 
a t qaggag ca 



ga a a 
a a g g 
tggt 

t g a a 
t era 

z z a g 

t _r '-t 

at 1 1 

ggee 

Lgag 

gga 

cgee 

t rgt 

rttc 

J^ae 

gaer 

-it e 

3 3ac 

3 tee 

eaa t 

•3etg 

e it e 

-r-t n 
r t c e 



eg ja eg 
aa gagg 
agaaag 
.;tgt et 

r -aqqq 

aaegtg 
atga-r 
aeetgc 
tgeegg 
agtg 
agtctg 
eae cat 
gt gaa t 
tetggg 
tegt. -.;«.: 
cca gag 
eaa a 1 q 
c a g a t c 
1 1 e a e c 
aaegtg 
gqactg 
V. ■ ' * e:- 
etgtcc 
■-r et at 
a a q :• a g 



J J --geagggg 
3 rggtgtgtt 
L t. cctccgag 
gtgagaeet e 
atggaggcaa 
c eeeggatet 
t g tgaagget 
r.eet t eaa • 1 
= tea a a eg ct 
caga gga a g c 
;gg..:ecaage 
aagaccta eg 
^atggtggag 
gartectcct 
ag.-! 1 et c r,i 
e t g t e c e a g e 
Iteactggct 
gtaetgetga 
atggacgaca 
aceaaage-g 
aagaagct g a 
ceaqat cgte 
2?.crnza zt ge 
gceaaqat ga 
taccg ct-g-.v 



hi 

1 ;:■ 0 

1P0 
24 0 
31. 0 
3<-.0 
450 
4P0 
5 '1 0 
60 0 
6h0 
75 0 
7R0 
840 
90 0 
900 
102 0 
10 8 0 
1140 
150 0 
1560 
1320 
i 3 HO 
14 40 
1 500 
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tc':cc:t::a gcctyag t. gc r.arar^agc taacgcccc: tgt.gctcgaa gtgoooggcu 1560 
at<ui(j«it. "t 7 ctga 1574 



SEQ ID NO: 8
<211> 122
<212> DNA
<213> Homo sapiens

<400> 8
ggctcctgaa cctagcccag ctggacggag aaatggactc tagcctcctc tgatagcctc 60
atgccaggcc ccgtgcacat tgctttgctt gcctccctca atcctcatag cttctctttg 120
gg 122



r»r:v ID NO: 9 
<:H1- 477 
•:712 s ?HT 

-:.!13 Homo sapiens 
•:4CC' 9 

Ket Glu Trp A: (j Asa I.ys Lys Arg Ser Asp Trp le.i Ser Met V.j 1 T, ( mi 
1 5 lb 15 

Arg Thr Ala Gly Val G : u Glu Ala The Gly Ser Glu Val Her Val Arg 

20 25 30 

Pro His Arg Arg Ala Pro Leu Gly Ser Tar Tyr Lea Fro Pro Ala Pro 
3 5 -10 4 5 

lS"t Gly Net Glu Ala M-t Ala Ala Ser Thr Ser Leu Pro Asp Flo Gly 

50 55 60 

Asp Phe Asp Arg As n Val Pro Arg Tin Cys Gly Val Cys Gly Asp Arg 
o5 7Q vr: PO 

Ahi Thr Gly The His Phe Asn Ala Met Thr Cys Glu G L y Cys Lys Gly 
Ha 9 0 9 5 

Phe Phe Arg Arq Ser Mot Lys Arg Lys Ala Leu Phe Thr Cys Pt a Fhe 
100 105 110 

Asn Gly Asp Cys Ar cj Tie Thr Lys Asp Asn Arg Arg His Cys Gin Ala 
115 12 0 17 5 

Cys Arg Leu Lys Arg Cys Val Aso He Gly Mot Mo: Lvs Glu Phe He 
130 135 * 140 

Leu Thr Asp Glu Glu Val Gin Arg Lys Arg Glu Met lie Leu Lys Arg 
145 150 155 160 

Lys Glu Glu Glu Ala Leu Lys Asp Ser Leu Arg Pro Lys Leu Ser Glu 
16 3 17 0 1 -7 5 

Glu Gin Gin Arg He He Ala lie Leu Lei Asp Ala His His Lys Thr 
IRC 105 ' luij 

Tyr Asp Pro Thr Tyr Ser Asp Phe Cys Gin Phe Ar-j Pro P-o Val Arq 

195 2 0 0 2 : J 5 
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Val Asn Asp Gly Gly Gly Ser His Pro Ser Arq Fro Asn . c er Arg H i s 

210 215 22 n 

Thr Fro Ser Pno Ser Gly Asp Ser Ser Ser Ser Cvs Ser Asp His Cys 
225 vjj 235 " ^ 240 

lie Thr Ser Ser Asp Met. Met Asp Ser Ser Ser Phe Ser Asn Leu Asp 

24- 250 :\V> 

Leu Ser Glu Gin Asp Ser Asp Asp Pre Ser Val Tk: Leu G:u Leu Ser 

2G0 2h5 2"? C 

Gin Leu Ser Met Leu Pro H n Leu Ala Asp Leu Val Ser Tyr Ser lie 

28 0 ' 285 

Gin Lys Val lie Sly Phe Ala Lys Met lie Pro C • y Phe Arq Asp Leu 

290 295 3C0 

Thr Ser Glu Asp Sir: Tie Val Leu Leu Lys Ser Ser Ala Me Glu Val 
305 310 315 320 

lie Met Leu Arg Ser Asn Glu Ser Phe Thr Met Asp Asp Met Ser Trp 
325 J30 ' 335 

Thr Cys Gly Asn Gin Asp Tyr Lys Tyr Arq Val Ser Asp Val Thr Lys 
310 345 350 

Ala Gly His Ser Leu Glu Leu He Glu Fro Leu He Lys Phe Gin Va j 
3 5 5 3 6 0 3 6 5 

Gly Leu Lys Lys Leu Asn Leu His Glu Glu Glu His Val Leu Leu Met 
3' ; 0 375 381 

Ala Tie Cys He Val Ser Pre Asp Arq Pro Gly Va : Gin Asn Ala Ala 
335 390 395 

Leu He Glu Ala lie Gin Asp Arg Leu Ser Asn Thr Leu Gin Thr Tyr 
4 0 5 4 10 4 15 

He Arg Cys Arg His Pro Pro Pro Gly Ser His Leu Leu Tvr Ala Lys 
420 425 4 30 

Met Tip Gin Lys Leu Ala Asp Leu Arg Ser Leu Asn Glu Glu His Ser 
435 44 0 445 

Lys Gin Tyr Arg Cys Leu Ser Phe Gin Pro Glu Cvs Ser Met Lys Leu 
450 455 460 



Thr Fro Leu Val Leu Glu Va] Phe Cly Asn Glu 
4 6!) 4 70 475 
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CEO ID NO: 10 
<;U1> 4 SO 
<212> PRT 

< 2 1 3 > Homo s a p i n n 5 
<.|(H)> 10 

Met Glu Trp Arq Asn Lys Lys Arc] Ser Asp Trp Leu :>r Met Val Lpu 
1 5 1J IS 

Arc] Thr Ala Gly Val Glu Gly Met Glu Aid Ket AH Ala Ser Thr Ser 

2 0 2 5 3 0 

Leu Pro Asp Pro ~ly Asp Phe Asp Arc Asa Val Fro Arq He Cys Gly 
3 5 -1 ) -15 

Val Cys Gly Asp Arg Ala 'Ihr Gly Phe His The Asa Ala Met Thr Cvs 

^0 5 5 60 

Glu Gly Cys Lys Sly Phe the Arg A: ;j Ser M-t I.yr. Arg Lys Ala Leu 
C5 7 0 -?5 m 

Phe Thr Cys Pro rhe Asa G ■ y Asp Cvs Arq lie Thr Lys Asp As .a Atq 
6 5 m (i 95 

Arg His Cys Gin Ala Cys Arc > u lys Arq Cys Val Asp He Gly Met 
* '> 1 105 lit; 

Met Lys Glu Phe Mr Leu Thr Asp Glu Glu Val Sir. Arq I.ys Arq Glu 
115 ISO 125 

Met lie Leu Lys Arg Lys Glu Glu Glu Ala Leu Lvs Asp Ser Leu Arq 
13C 135 Mi 

Tro Lys Leu Ser Glu Glu Gin Gin A; ) " I c rie Ala lie L-u Leu Asp 
1^5 ISO 155 1 60 

Ala His His Lys Thr Tyr Asp Fro Thr Syr ier Asp Phe Cys Gin Ph^ 
1G5 170 17 5 

Arq Pro Fro Val Arq V« I Asn Asp Gly Gly Gly Ser His Pro Ser Arq 
130 lBh 193 

Fro Asn Ser Arq His Thr Pi o Ser Phe Ser Gly Asp Ser Ser Ser Ser 
1^5 200 ' /O.s 

Cys Ser Asp His Cys He Thr Ser Ser Asp Met Met Asp Ser Ser Ser 

210 CIS 220 

Fhe Set Asn Leu Asp Leu Ser Glu Glu Asp Ser Asr Asp Pro Ser Val 
225 230 235 240 

Thr Leu Glu Leu Ser Gin .Leu Ser Met Leu Pro His Leu Ala Asp Leu 
24 5 C'.p 255 

Val Ser Tyr Ser He Gin Lys Val Tie Sly Phe All Lys Met He Fro 

2 0 5 L70 

Gly Phe Ac:] Asp Leu Thr Ser Glu Asp Sir. He Val Leu Leu I vs Ser 
2 75 28 0 2 p 5 
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Ser Ala lie Glu Vu ' Tin Met Leu Arg Ser Asn G: u Ser Phe Thr Ket 

290 2?5 3Cf 

Asp Asp Met Ser Tro l'hr Cys Gly Asn G n Asp Tvr Lys Tvr Arg Val 
305 310 315 320 

Ser Asp Va 1 Thr Lys Ala Gly His Ser Leu Glu Leu lie Glu Pro Leu 

325 330 335 

lie Lys Fhe Gin Val Gly Leu Lys Lys Leu Asn lev. His Glu Glu Glu 

34 0 3 4 3 350 

His Val Leu Leu Ket Ala lie Cys lie Val Ser P r u Asp Arg Pro Gly 

355 360 363 

Val Gin Asp Ala Ai a Leu lie Glu Ala He Gin Asp Arq Leu Ser Asn 
3^0 375 3R0 

Thr Leu Gin Thr Tyr He Arg Cys Arg His Pro Prs Fro Gly Ser His 

3>3 5 3 90 3 95 4 00 

Leu Leu Tyr Ala Lys Met He Gin Lys Leu Ala Asp Leu Arg Ser Leu 
4 05 410 4 15 

Asn Glu Glu His Ser Lys Gin Tyr Arg Cys Leu Ser Fhe Gin Pro Glu 
420 423 430 

Cys Ser Met Lys Leu Thr Pro Leu Val Leu GLu Val Fhe Gly Asn Glu 
4 33 440 443 

He Ser 
450 



SEQ ID NO: 11
<211> 72
<212> PRT
<213> Homo sapiens

<400> 11
Met Glu Trp Arg Asn Lys Lys Arg Ser Asp Trp Leu Ser Met Val Leu
1               5                   10                  15

Arg Thr Ala Gly Val Glu Gly Met Glu Ala Met Ala Ala Ser Thr Ser
            20                  25                  30

Leu Pro Asp Pro Gly Asp Phe Asp Arg Asn Val Pro Arg Ile Cys Gly
        35                  40                  45

Val Cys Gly Asp Arg Ala Thr Gly Phe His Phe Asn Ala Met Thr Cys
    50                  55                  60

Glu Gly Cys Lys Gly Phe Phe Arg
65                  70
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5EQ ID NO: 12 

-:;M1> 4 27 
:21Z> PRT 

<213> Home sapiens 
<A00> 12 

Met Glu Ala Met. Ala A ; a Scr Thr Ser Leu Pro Asp Pro Gly Asp Phe 
1 5 10 IS 

Asp Arg Asn Val Pro Arc; lie Cys Gl y Val Cys Gly Asp Arg Ai a Thr 

2i> 2 5 30 

Giy ?he. His Phe Asn Ala Met Thr Cys Glu Gly Cys Lys Gly Fhe Phe 
3 5 4 0 4 C 

Arg Arg Ser Met Lys Arc; Lys Aln Leu The Thr Cys Pro Phe Asn Gly 
5 0 S5 »'i 

Asp Cys Arg lie Thr Lys Asp Asn Arq Arg His Cys Gin Ala Cys Arg 

0 5 7 0 7 5 8 0 

Leu Lys Arg Cys Val Asp He Gly Met Met Lys Glu Phe He Leu Thr 

8 5 3 0 0 5 

Asp Glu Glu Val Gin Alu Lys Arq Glu Met He Leu Lys Arg Lys Glu 
100 105 11 0 

Giu Glu Aln Leu Lys Asp Ser Leu Arg Pro Lys Leu Ser Glu Glu Girt 

115 12C ' 125 

Gin Arg He He Ala Tie Leu Leu Asp Ala His His Lys Thr Tvr Asp 
130 135 I AC) 

Fro Thr Tyr Ser Asp Phe Cys Gin Phe Arg Pro Pro Val Arg Vnl Asn 
:<15 150 155 160 

Asp Gly Gly Gly Ser His Pro Ser Arg Pro Asn Ser Arg His Thr Pro 
16 5 L 7 o 17 5 

Sei Phe Ser Gly Asp Ser Ser Ser Ser Cys Ser Asp His Cys He Thr 
180 185 100 

Ser Ser Asp Met Met. Asp Ser Ser Ser Phe Ser Asn Leu Asp Leu Ser 

195 200 205 

Glu Glu Asp Ser Asp Asp Pro Ser Val Thr Leu Glu Leu Ser Gin Leu 
210 215 22 0 

Ser Met Leu Pro His Lou Ala Asp Leu Val Ser Tyr Ser He Sin Lys 
225 230 235 ' 240 

Val He Gly Phe Ala Lys Met Tie Pro Gly Phe Arg Asp Leu Thr Ser 

2 4 5 > 5 C 2 55 

Cm; Asp Gin He Val Leu Leu Lys Ser Ser Ala II? Glu Val Tie Met 

2o0 ? 65 27 J 

Leu Arg S-r Asn Glu Ser Phe Thr Met Asp Asp Met Ser Trp Thr Cys 

2 7 5 28 0 26 5 
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Gly Asn Gin Asp Tyr Lys Tyr Arg V« 1 Set Asp Val Thr Lys A: a G] y 

290 295 300 

His Ser Leu Glu Leu lie Glu Pro Leu lie Lys Phe Gin Val Gly Leu 

305 3)0 315 32 0 

Lys Lys Leu Asn Lou H:s Glu Glu Glu His Val Leu Leu Met Ala lie 

325 5 ,30 335 

Cys lie Val 5er Pro Asp Arg Pro Gly Val Gin Asp Ala Ala Leu Tie 

3 4 0 34 5 3 50 

Glu Ala Tie Glr: Asp Arc Leu Sec Asn Thr Leu Gin Thr Tyr lie Arg 

355 3 6 U 3fc': 

L'ys Arg His Pro Pro Flo Gly Ser His I.eu Leu Tyr Aln Lys Met lie 

37 0 37 5 330 

Gin Lys Leu Ala Asp Leu Arg Ger Leu Asn Glu Glu his Ser Lys Gin 

385 3r'j 3 C '5 400 

Tyr Arg Cys Leu .Ser Fhe Gin Pro Glu t'ys Ser Met. Lys Leu Thr Pro 

4 05 4 10 415 

Leu Val Leu Glu Val The Gly Asn Glu lie Ser 

4 2 0 " 4 2 5 
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Claims:- 

1. An isolated polynucleotide molecule encoding a human Vitamin D
receptor (hVDR) isoform, said polynucleotide molecule comprising a
nucleotide sequence which includes sequence that substantially corresponds
or is functionally equivalent to that of exon 1d of the human VDR gene.

2. A polynucleotide molecule according to claim 1, wherein said
nucleotide sequence further includes sequence that substantially
corresponds or is functionally equivalent to that of exon 1b and/or exon 1c.

3. A polynucleotide molecule according to claim 1, wherein the
nucleotide sequence includes:

(i) sequence that substantially corresponds or is functionally
equivalent to that of exons 1d, 1c and 2-9 and encodes a VDR isoform of
approximately 477 amino acids,

(ii) sequence that substantially corresponds or is functionally
equivalent to that of exons 1d and 2-9 and encodes a VDR isoform of
approximately 450 amino acids, or

(iii) sequence that substantially corresponds or is functionally
equivalent to that of exons 1d and 2-9 and further includes a 152 bp intronic
sequence and encodes a truncated VDR isoform of approximately 72 amino
acids.

4. A polynucleotide molecule according to claim 1, wherein the
nucleotide sequence substantially corresponds to that shown as SEQ ID NO:
2, SEQ ID NO: 3 or SEQ ID NO: 4.

5. An isolated polynucleotide molecule encoding a human Vitamin D
receptor (hVDR), said polynucleotide molecule comprising a nucleotide
sequence which includes sequence that substantially corresponds or is
functionally equivalent to that of exon 1f and/or 1c of the human VDR gene.

6. A polynucleotide molecule according to claim 5, wherein the
nucleotide sequence further includes sequence that substantially
corresponds or is functionally equivalent to that of exon 1c.

7. A polynucleotide molecule according to claim 5, wherein the
nucleotide sequence includes sequence that substantially corresponds or is
functionally equivalent to that of exons 1f and 2-9.

8. A polynucleotide molecule according to claim 5, wherein the
nucleotide sequence substantially corresponds to that shown as SEQ ID NO:
7.

9. A plasmid or expression vector including a polynucleotide molecule
according to any one of the preceding claims.

10. A host cell transformed with a polynucleotide molecule according to
any one of claims or a plasmid or expression vector according to claim 9.

11. A host cell according to claim 10, wherein the cell is a mammalian
cell.

12. A host cell according to claim 10, wherein the cell is an NIH 3T3 or COS
7 cell.

13. A method of producing a VDR or VDR isoform or functionally
equivalent fragments thereof, comprising culturing a host cell of any one of
claims 10-12 under conditions enabling the expression of the polynucleotide
molecule and, optionally, recovering the VDR or VDR isoform or functionally
equivalent fragments thereof.

14. A method according to claim 13, wherein the VDR or VDR isoform or
functionally equivalent fragments thereof are expressed onto the host cell
membrane or other sub-cellular compartment.

15. A human Vitamin D receptor (hVDR) isoform or functionally
equivalent fragment thereof encoded by a polynucleotide molecule according
to any one of claims 1-4, said hVDR isoform or functionally equivalent
fragment thereof being in a substantially pure form.

16. An antibody or antibody fragment capable of specifically binding to a
VDR isoform according to claim 15.

17. A non-human animal transformed with a polynucleotide molecule
according to any one of claims 1-8.

18. A method for detecting agonist and/or antagonist compounds of a VDR
isoform of claim 15, comprising contacting said VDR isoform, functionally
equivalent fragment thereof or a cell transformed with and expressing a
polynucleotide molecule according to any one of claims 1-4, with a test
compound under conditions enabling the activation of the VDR isoform or
functionally equivalent fragment thereof, and detecting an increase or
decrease in the activity of the VDR isoform or functionally equivalent
fragment thereof.

19. An oligonucleotide or polynucleotide probe comprising a nucleotide
sequence of 10 or more nucleotides, the probe comprising a nucleotide
sequence such that the probe specifically hybridises to a polynucleotide
molecule according to any one of claims 1-8 under high stringency
conditions.

20. An antisense polynucleotide molecule comprising a nucleotide
sequence capable of specifically hybridising to a mRNA molecule which
encodes a VDR or VDR isoform encoded by a polynucleotide molecule
according to any one of claims 1-8, so as to prevent translation of the mRNA
molecule.

21. An isolated polynucleotide molecule comprising a nucleotide sequence
showing greater than 75% sequence identity to:

(i) 5'TGCGACCTTGGCGGTGAGCCTGGGGACAGGGGTGAGGCCAGAGA
CGGACGGACGCAGGGGCCCGGCCCAAGGCGAGGGAGAACAGCGGCACTA
AGGCAGAAAGGAAGAGGGCGGTGTGTTCACCCGCAGCCCAATCCATCAC
TCAGCAACTCCTAGACGCTGGTAGAAAGTTCCTCCGAGGAGCCTGCCATC
CAGTCGTGCGTGCAG3' (SEQ ID NO: 5),

(ii) 5'AGGCAGCATGAAACAGTGGGATGTGCAGAGAGAAGATCTGGGTC
CAGTAGCTCTGACACTCCTCAGCTGTAGAAACCTTGACAACTCTGCACAT
CAGTTGTACAATGGAACGGTATTTTTTACTCTTCATGTCTGAAAAGGCTA
TGATAAAGATCAA3' (SEQ ID NO: 6), or

(iii) 5'GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGGAGGAATA
AGAAAAGGAGCGATTGGCTGTCGATGGTGCTCAGAACTGCTGGAGTGGA
GG3' (SEQ ID NO: 1).

22. An isolated polynucleotide molecule comprising a nucleotide sequence
showing greater than 85% sequence identity to:

(i) 5'TGCGACCTTGGCGGTGAGCCTGGGGACAGGGGTGAGGCCAGAGA
CGGACGGACGCAGGGGCCCGGCCCAAGGCGAGGGAGAACAGCGGCACTA
AGGCAGAAAGGAAGAGGGCGGTGTGTTCACCCGCAGCCCAATCCATCAC
TCAGCAACTCCTAGACGCTGGTAGAAAGTTCCTCCGAGGAGCCTGCCATC
CAGTCGTGCGTGCAG3' (SEQ ID NO: 5),

(ii) 5'AGGCAGCATGAAACAGTGGGATGTGCAGAGAGAAGATCTGGGTC
CAGTAGCTCTGACACTCCTCAGCTGTAGAAACCTTGACAACTCTGCACAT
CAGTTGTACAATGGAACGGTATTTTTTACTCTTCATGTCTGAAAAGGCTA
TGATAAAGATCAA3' (SEQ ID NO: 6), or

(iii) 5'GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGGAGGAATA
AGAAAAGGAGCGATTGGCTGTCGATGGTGCTCAGAACTGCTGGAGTGGA
GG3' (SEQ ID NO: 1).

23. An isolated polynucleotide molecule comprising a nucleotide sequence
showing greater than 95% sequence identity to:

(i) 5'TGCGACCTTGGCGGTGAGCCTGGGGACAGGGGTGAGGCCAGAGA
CGGACGGACGCAGGGGCCCGGCCCAAGGCGAGGGAGAACAGCGGCACTA
AGGCAGAAAGGAAGAGGGCGGTGTGTTCACCCGCAGCCCAATCCATCAC
TCAGCAACTCCTAGACGCTGGTAGAAAGTTCCTCCGAGGAGCCTGCCATC
CAGTCGTGCGTGCAG3' (SEQ ID NO: 5),

(ii) 5'AGGCAGCATGAAACAGTGGGATGTGCAGAGAGAAGATCTGGGTC
CAGTAGCTCTGACACTCCTCAGCTGTAGAAACCTTGACAACTCTGCACAT
CAGTTGTACAATGGAACGGTATTTTTTACTCTTCATGTCTGAAAAGGCTA
TGATAAAGATCAA3' (SEQ ID NO: 6), or

(iii) 5'GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGGAGGAATA
AGAAAAGGAGCGATTGGCTGTCGATGGTGCTCAGAACTGCTGGAGTGGA
GG3' (SEQ ID NO: 1).

24. An isolated polynucleotide molecule comprising nucleotide sequence
substantially corresponding to:

(i) 5'TGCGACCTTGGCGGTGAGCCTGGGGACAGGGGTGAGGCCAGAGA
CGGACGGACGCAGGGGCCCGGCCCAAGGCGAGGGAGAACAGCGGCACTA
AGGCAGAAAGGAAGAGGGCGGTGTGTTCACCCGCAGCCCAATCCATCAC
TCAGCAACTCCTAGACGCTGGTAGAAAGTTCCTCCGAGGAGCCTGCCATC
CAGTCGTGCGTGCAG3' (SEQ ID NO: 5),

(ii) 5'AGGCAGCATGAAACAGTGGGATGTGCAGAGAGAAGATCTGGGTC
CAGTAGCTCTGACACTCCTCAGCTGTAGAAACCTTGACAACTCTGCACAT
CAGTTGTACAATGGAACGGTATTTTTTACTCTTCATGTCTGAAAAGGCTA
TGATAAAGATCAA3' (SEQ ID NO: 6), or

(iii) 5'GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGGAGGAATA
AGAAAAGGAGCGATTGGCTGTCGATGGTGCTCAGAACTGCTGGAGTGGA
GG3' (SEQ ID NO: 1).
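
Claims 21 to 23 are framed in terms of percentage sequence identity to the recited sequences. The claims do not state how identity is to be computed, so the following is only a minimal sketch that assumes an ungapped, position-by-position comparison of two sequences of equal length; practical comparisons would normally use an alignment tool that handles insertions and deletions. The function name and the example variant are hypothetical.

```python
# Exon 1d sequence as recited for SEQ ID NO: 1 (96 nucleotides).
SEQ_ID_NO_1 = (
    "GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGGAGGAATA"
    "AGAAAAGGAGCGATTGGCTGTCGATGGTGCTCAGAACTGCTGGAGTGGA"
    "GG"
)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch compares equal-length sequences only")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    a, b = seq_a.upper(), seq_b.upper()
    matches = sum(base_a == base_b for base_a, base_b in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical variant carrying a single substitution at the first position.
variant = "A" + SEQ_ID_NO_1[1:]
identity = percent_identity(SEQ_ID_NO_1, variant)

# Thresholds recited in claims 21, 22 and 23 respectively.
for threshold in (75.0, 85.0, 95.0):
    print(f"identity {identity:.2f}% exceeds {threshold:.0f}%: {identity > threshold}")
```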



A.

5'...atcccttaagGGCTCCTGAACCTAGCCCAGCTGGACGGAG
AAATGGACTCTAGCCTCCTCTGATAGCCTCATGCCAGGCCC
CGTGCACATTGCTTTGCTTGCCTCCCTCAATCCTCATAGCT
TCTCTTTGGGgtaagtacag...3'

B.

5'...TGCGACCTTGGCGGTGAGCCTGGGGACAGGGGTGAGGC
CAGAGACGGACGGACGCAGGGGCCCGGCCCAAGGCGAGGG
AGAACAGCGGCACTAAGGCAGAAAGGAAGAGGGCGGTGTG
TTCACCCGCAGCCCAATCCATCACTCAGCAACTCCTAGAC
GCTGGTAGAAAGTTCCTCCGAGGAGCCTGCCATCCAGTCGT
GCGTGCAG...3'

C.

5'...tgttttttagAGGCAGCATGAAACAGTGGGATGTGCAGAG
AGAAGATCTGGGTCCAGTAGCTCTGACACTCCTCAGCTGT
AGAAACCTTGACAACTCTGCACATCAGTTGTACAATGGAA
CGGTATTTTTTACTCTTCATGTCTGAAAAGGCTATGATAA
AGATCAAgtaagatatt...3'

D.

5'...GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGG
AGGAATAAGAAAAGGAGCGATTGGCTGTCGATGGTGCTCA
GAACTGCTGGAGTGGAGGgtgtgtaacc...3'



FIGURE 4



Printed from Mimosa 01/09/21 09 02 22 Page 39 



WO 99/16872 



PCT'A I! 98/008 P 



5/20 



FIGURE 5 TRANSCRIPT 6 

(Sequence Range: 1 to 1463)

10 20 30 40 50 

* * * * * * * * * * 

GTTTCCTTCT TCTGTCGGGG CGCCTTGGCA TGGAGTGGAG GAATAAGAAA
CAAAGGAAGA AGACAGCCCC GCGGAACCGT ACCTCACCTC CTTATTCTTT 

MetGluTrpArg AsnLysLys> 

60 70 80 90 100

* * * * * * * * * *

AGGAGCGATT GGCTGTCGAT GGTGCTCAGA ACTGCTGGAG TGGAGGAAGC
TCCTCGCTAA CCGACAGCTA CCACGAGTCT TGACGACCTC ACCTCCTTCG 
ArgSerAsp TrpLeuSerMet ValLeuArg ThrAlaGly ValGluGluAla> 

110 120 130 140 150

* * * * * * * * * *

CTTTGGGTCT GAAGTGTCTG TGAGACCTCA CAGAAGAGCA CCCCTGGGCT 
GAAACCCAGA CTTCACAGAC ACTCTGGAGT GTCTTCTCGT GGGGACCCGA 
PheGlySer GluValSer ValArgProHis ArgArgAla ProLeuGly> 

160 170 180 190 200

* * * * * * * * * *

CCACTTACCT GCCCCCTGCT CCTTCAGGGA TGGAGGCAAT GGCGGCCAGC
GGTGAATGGA CGGGGGACGA GGAAGTCCCT ACCTCCGTTA CCGCCGGTCG
SerThrTyrLeu ProProAla ProSerGly MetGluAlaMet AlaAlaSer>

210 220 230 240 250

* * * * * * * * * *

ACTTCCCTGC CTGACCCTGG AGACTTTGAC CGGAACGTGC CCCGGATCTG
TGAAGGGACG GACTGGGACC TCTGAAACTG GCCTTGCACG GGGCCTAGAC
ThrSerLeu ProAspProGly AspPheAsp ArgAsnVal ProArgIleCys>

260 270 280 290 300

* * * * * * * * * *

TGGGGTGTGT GGAGACCGAG CCACTGGCTT TCACTTCAAT GCTATGACCT
ACCCCACACA CCTCTGGCTC GGTGACCGAA AGTGAAGTTA CGATACTGGA
GlyValCys GlyAspArg AlaThrGlyPhe HisPheAsn AlaMetThr> 

310 320 330 340 350 

* * * * * * * + * 

GTGAAGGCTG CAAAGGCTTC TTCAGGCGAA GCATGAAGCG GAAGGCACTA 
CACTTCCGAC GTTTCCGAAG AAGTCCGCTT CGTACTTCGC CTTCCGTGAT 
CysGluGlyCys LysGlyPhe PheArgArg SerMetLysArg LysAlaLeu> 

360 370 380 390 400

* * * * * * * * * *

TTCACCTGCC CCTTCAACGG GGACTGCCGC ATCACCAAGG ACAACCGACG
AAGTGGACGG GGAAGTTGCC CCTGACGGCG TAGTGGTTCC TGTTGGCTGC
PheThrCys ProPheAsnGly AspCysArg IleThrLys AspAsnArgArg> 




410 420 430 440 450 

* * * * * * * * * * 

CCACTGCCAG GCCTGCCGGC TCAAACGCTG TGTGGACATC GGCATGATGA
GGTGACGGTC CGGACGGCCG AGTTTGCGAC ACACCTGTAG CCGTACTACT
HisCysGln AlaCysArg LeuLysArgCys ValAspIle GlyMetMet> 

460 470 480 490 500 

* * * * * * * * * * 

AGGAGTTCAT TCTGACAGAT GAGGAAGTGC AGAGGAAGCG GGAGATGATC
TCCTCAAGTA AGACTGTCTA CTCCTTCACG TCTCCTTCGC CCTCTACTAG
LysGluPheIle LeuThrAsp GluGluVal GlnArgLysArg GluMetIle>

510 520 530 540 550

* * * * * * * * * *

CTGAAGCGGA AGGAGGAGGA GGCCTTGAAG GACAGTCTGC GGCCCAAGCT
GACTTCGCCT TCCTCCTCCT CCGGAACTTC CTGTCAGACG CCGGGTTCGA
LeuLysArg LysGluGluGlu AlaLeuLys AspSerLeu ArgProLysLeu>

560 570 580 590 600

* * * * * * * * * *

GTCTGAGGAG CAGCAGCGCA TCATTGCCAT ACTGCTGGAC GCCCACCATA
CAGACTCCTC GTCGTCGCGT AGTAACGGTA TGACGACCTG CGGGTGGTAT
SerGluGlu GlnGlnArg IleIleAlaIle LeuLeuAsp AlaHisHis>

610 620 630 640 650

* * * * * * * * * *

AGACCTACGA CCCCACCTAC TCCGACTTCT GCCAGTTCCG GCCTCCAGTT
TCTGGATGCT GGGGTGGATG AGGCTGAAGA CGGTCAAGGC CGGAGGTCAA
LysThrTyrAsp ProThrTyr SerAspPhe CysGlnPheArg ProProVal>

660 670 680 690 700

* * * * * * * * * *

CGTGTGAATG ATGGTGGAGG GAGCCATCCT TCCAGGCCCA ACTCCAGACA
GCACACTTAC TACCACCTCC CTCGGTAGGA AGGTCCGGGT TGAGGTCTGT
ArgValAsn AspGlyGlyGly SerHisPro SerArgPro AsnSerArgHis>

710 720 730 740 750

* * * * * * * * * *

CACTCCCAGC TTCTCTGGGG ACTCCTCCTC CTCCTGCTCA GATCACTGTA 
GTGAGGGTCG AAGAGACCCC TGAGGAGGAG GAGGACGAGT CTAGTGACAT 
ThrProSer PheSerGly AspSerSerSer SerCysSer AspHisCys> 

760 770 780 790 800

* * * * * * * * * *

TCACCTCTTC AGACATGATG GACTCGTCCA GCTTCTCCAA TCTGGATCTG
AGTGGAGAAG TCTGTACTAC CTGAGCAGGT CGAAGAGGTT AGACCTAGAC
IleThrSerSer AspMetMet AspSerSer SerPheSerAsn LeuAspLeu>

810 820 830 840 850

* * * * * * * * * *

AGTGAAGAAG ATTCAGATGA CCCTTCTGTG ACCCTAGAGC TGTCCCAGCT
TCACTTCTTC TAAGTCTACT GGGAAGACAC TGGGATCTCG ACAGGGTCGA
SerGluGlu AspSerAspAsp ProSerVal ThrLeuGlu LeuSerGlnLeu>

860 870 880 890 900

* * * * * * * * * *

CTCCATGCTG CCCCACCTGG CTGACCTGGT CAGTTACAGC ATCCAAAAGG 
GAGGTACGAC GGGGTGGACC GACTGGACCA GTCAATGTCG TAGGTTTTCC 
SerMetLeu ProHisLeu AlaAspLeuVal SerTyrSer IleGlnLys> 

910 920 930 940 950 

* * * * * * * * * * 

TCATTGGGTT TGCTAAGATG ATACCAGGAT TCAGAGACCT CACCTCTGAG 
AGTAACCGAA ACGATTCTAC TATGGTCCTA AGTCTCTGGA GTGGAGACTC 
ValIleGlyPhe AlaLysMet IleProGly PheArgAspLeu ThrSerGlu>

960 970 980 990 1000

* * * * * * * * * *

GACCAGATCG TACTGCTGAA GTCAAGTGCC ATTGAGGTCA TCATGTTGCG
CTGGTCTAGC ATGACGACTT CAGTTCACGG TAACTCCAGT AGTACAACGC
AspGlnIle ValLeuLeuLys SerSerAla IleGluVal IleMetLeuArg>

1010 1020 1030 1040 1050 

* * * * * * * * * * 

CTCCAATGAG TCCTTCACCA TGGACGACAT GTCCTGGACC TGTGGCAACC
GAGGTTACTC AGGAAGTGGT ACCTGCTGTA CAGGACCTGG ACACCGTTGG
SerAsnGlu SerPheThr MetAspAspMet SerTrpThr CysGlyAsn>

1060 1070 1080 1090 1100

* * * * * * * * * *

AAGACTACAA GTACCGCGTC AGTGACGTGA CCAAAGCCGG ACACAGCCTG
TTCTGATGTT CATGGCGCAG TCACTGCACT GGTTTCGGCC TGTGTCGGAC
GlnAspTyrLys TyrArgVal SerAspVal ThrLysAlaGly HisSerLeu>

1110 1120 1130 1140 1150

* * * * * * * * * *

GAGCTGATTG AGCCCCTCAT CAAGTTCCAG GTGGGACTGA AGAAGCTGAA
CTCGACTAAC TCGGGGAGTA GTTCAAGGTC CACCCTGACT TCTTCGACTT
GluLeuIle GluProLeuIle LysPheGln ValGlyLeu LysLysLeuAsn>

1160 1170 1180 1190 1200

* * * * * * * * * *

CTTGCATGAG GAGGAGCATG TCCTGCTCAT GGCCATCTGC ATCGTCTCCC
GAACGTACTC CTCCTCGTAC AGGACGAGTA CCGGTAGACG TAGCAGAGGG
LeuHisGlu GluGluHis ValLeuLeuMet AlaIleCys IleValSer>

1210 1220 1230 1240 1250 

* * * * * * * * ** 

CAGATCGTCC TGGGGTGCAG GACGCCGCGC TGATTGAGGC CATCCAGGAC 
GTCTAGCAGG ACCCCACGTC CTGCGGCGCG ACTAACTCCG GTAGGTCCTG 
ProAspArgPro GlyValGln AspAlaAla LeuIleGluAla IleGlnAsp> 



1260 1270 1280 1290 1300

* * * * * * * * * *

CGCCTGTCCA ACACACTGCA GACGTACATC CGCTGCCGCC ACCCGCCCCC
GCGGACAGGT TGTGTGACGT CTGCATGTAG GCGACGGCGG TGGGCGGGGG
ArgLeuSer AsnThrLeuGln ThrTyrIle ArgCysArg HisProProPro>

1310 1320 1330 1340 1350

* * * * * * * * * *

GGGCAGCCAC CTGCTCTATG CCAAGATGAT CCAGAAGCTA GCCGACCTGC
CCCGTCGGTG GACGAGATAC GGTTCTACTA GGTCTTCGAT CGGCTGGACG
GlySerHis LeuLeuTyr AlaLysMetIle GlnLysLeu AlaAspLeu>

1360 1370 1380 1390 1400

* * * * * * * * * *

GCAGCCTCAA TGAGGAGCAC TCCAAGCAGT ACCGCTGCCT CTCCTTCCAG
CGTCGGAGTT ACTCCTCGTG AGGTTCGTCA TGGCGACGGA GAGGAAGGTC
ArgSerLeuAsn GluGluHis SerLysGln TyrArgCysLeu SerPheGln>

1410 1420 1430 1440 1450 

* * * * * * ** * * 

CCTGAGTGCA GCATGAAGCT AACGCCCCTT GTGCTCGAAG TGTTTGGCAA 
GGACTCACGT CGTACTTCGA TTGCGGGGAA CACGAGCTTC ACAAACCGTT 
ProGluCys SerMetLysLeu ThrProLeu ValLeuGlu ValPheGlyAsn>



1460 
* * 

TGAGATCTCC TGA 
ACTCTAGAGG ACT 
GluIleSer ***> 
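
The sequence figures use a three-row layout: the sense strand, its base-paired complement written in the same left-to-right order, and the three-letter translation of the open reading frame. The following is a minimal sketch of how a row can be checked for internal consistency. It assumes the standard genetic code, and the helper names `complement` and `translate` are illustrative only and are not part of the specification. The input is the first 50 nucleotides of the Figure 5 sequence.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (assumption: standard nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def complement(seq: str) -> str:
    """Return the base-paired strand in the same left-to-right order."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def translate(seq: str, start: int) -> str:
    """Translate complete codons of seq, reading from 1-based position start."""
    residues = []
    for i in range(start - 1, len(seq) - 2, 3):
        a, b, c = seq[i:i + 3]
        index = 16 * BASES.index(a) + 4 * BASES.index(b) + BASES.index(c)
        residues.append(THREE[AMINO[index]])
    return "".join(residues)


# Positions 1 to 50 of the Figure 5 sense strand.
first_row = "GTTTCCTTCTTCTGTCGGGGCGCCTTGGCATGGAGTGGAGGAATAAGAAA"
orf_start = first_row.index("ATG") + 1  # 1-based position of the first ATG

print(complement(first_row))
print(orf_start, translate(first_row, orf_start))
```

Comparing the printed complement and translation with the second and third rows of a block gives a quick way to spot transcription errors in any one of the three rows.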
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FIGURE 6 TRANSCRIPT 9 

(Sequence Range: 1 to 1382)

10 20 30 40 50

* * * * * * * * * *

GTTTCCTTCT TCTGTCGGGG CGCCTTGGCA TGGAGTGGAG GAATAAGAAA
CAAAGGAAGA AGACAGCCCC GCGGAACCGT ACCTCACCTC CTTATTCTTT 

MetGluTrpArg AsnLysLys> 

60 70 80 90 100 

* * * * * * * * * * 

AGGAGCGATT GGCTGTCGAT GGTGCTCAGA ACTGCTGGAG TGGAGGGGAT 
TCCTCGCTAA CCGACAGCTA CCACGAGTCT TGACGACCTC ACCTCCCCTA 
ArgSerAsp TrpLeuSerMet ValLeuArg ThrAlaGly ValGluGlyMet>

110 120 130 140 150 

+ * * * * * * * * * 

GGAGGCAATG GCGGCCAGCA CTTCCCTGCC TGACCCTGGA GACTTTGACC 
CCTCCGTTAC CGCCGGTCGT GAAGGGACGG ACTGGGACCT CTGAAACTGG 
GluAlaMet AlaAlaSer ThrSerLeuPro AspProGly AspPheAsp> 

160 170 180 190 200

* * * * * * * * * *

GGAACGTGCC CCGGATCTGT GGGGTGTGTG GAGACCGAGC CACTGGCTTT
CCTTGCACGG GGCCTAGACA CCCCACACAC CTCTGGCTCG GTGACCGAAA
ArgAsnValPro ArgIleCys GlyValCys GlyAspArgAla ThrGlyPhe>

210 220 230 240 250

* * * * * * * * * *

CACTTCAATG CTATGACCTG TGAAGGCTGC AAAGGCTTCT TCAGGCGAAG
GTGAAGTTAC GATACTGGAC ACTTCCGACG TTTCCGAAGA AGTCCGCTTC
HisPheAsn AlaMetThrCys GluGlyCys LysGlyPhe PheArgArgSer>

260 270 280 290 300 

* * * * * * * * * * 

CATGAAGCGG AAGGCACTAT TCACCTGCCC CTTCAACGGG GACTGCCGCA 
GTACTTCGCC TTCCGTGATA AGTGGACGGG GAAGTTGCCC CTGACGGCGT 
MetLysArg LysAlaLeu PheThrCysPro PheAsnGly AspCysArg> 

310 320 330 340 350 

* * * * * * + * * * 

TCACCAAGGA CAACCGACGC CACTGCCAGG CCTGCCGGCT CAAACGCTGT 
AGTGGTTCCT GTTGGCTGCG GTGACGGTCC GGACGGCCGA GTTTGCGACA 
IleThrLysAsp AsnArgArg HisCysGln AlaCysArgLeu LysArgCys> 

360 370 380 390 400 

* * * * * * ** * * 

GTGGACATCG GCATGATGAA GGAGTTCATT CTGACAGATG AGGAAGTGCA
CACCTGTAGC CGTACTACTT CCTCAAGTAA GACTGTCTAC TCCTTCACGT
ValAspIle GlyMetMetLys GluPheIle LeuThrAsp GluGluValGln>

410 420 430 440 450

* * * * * * * * * *

GAGGAAGCGG GAGATGATCC TGAAGCGGAA GGAGGAGGAG GCCTTGAAGG
CTCCTTCGCC CTCTACTAGG ACTTCGCCTT CCTCCTCCTC CGGAACTTCC
ArgLysArg GluMetIle LeuLysArgLys GluGluGlu AlaLeuLys>

460 470 480 490 500

* * * * * * * * * *

ACAGTCTGCG GCCCAAGCTG TCTGAGGAGC AGCAGCGCAT CATTGCCATA
TGTCAGACGC CGGGTTCGAC AGACTCCTCG TCGTCGCGTA GTAACGGTAT
AspSerLeuArg ProLysLeu SerGluGlu GlnGlnArgIle IleAlaIle>



510 520 530 540 550 

CTGCTGGACG CCCACCATAA GACCTACGAC CCCACCTACT CCGACTTCTG
GACGACCTGC GGGTGGTATT CTGGATGCTG GGGTGGATGA GGCTGAAGAC
LeuLeuAsp AlaHisHisLys ThrTyrAsp ProThrTyr SerAspPheCys>

560 570 580 590 600 

* * * * ** * * * * 

CCAGTTCCGG CCTCCAGTTC GTGTGAATGA TGGTGGAGGG AGCCATCCTT 
GGTCAAGGCC GGAGGTCAAG CACACTTACT ACCACCTCCC TCGGTAGGAA 
GlnPheArg ProProVal ArgValAsnAsp GlyGlyGly SerHisPro> 

610 620 630 640 650

* * * * * * * * * *

CCAGGCCCAA CTCCAGACAC ACTCCCAGCT TCTCTGGGGA CTCCTCCTCC
GGTCCGGGTT GAGGTCTGTG TGAGGGTCGA AGAGACCCCT GAGGAGGAGG
SerArgProAsn SerArgHis ThrProSer PheSerGlyAsp SerSerSer>

660 670 680 690 700

* * * * * * * * * *

TCCTGCTCAG ATCACTGTAT CACCTCTTCA GACATGATGG ACTCGTCCAG
AGGACGAGTC TAGTGACATA GTGGAGAAGT CTGTACTACC TGAGCAGGTC
SerCysSer AspHisCysIle ThrSerSer AspMetMet AspSerSerSer>

710 720 730 740 750 

* * » * * * * ★ * * 

CTTCTCCAAT CTGGATCTGA GTGAAGAAGA TTCAGATGAC CCTTCTGTGA 
GAAGAGGTTA GACCTAGACT CACTTCTTCT AAGTCTACTG GGAAGACACT 
PheSerAsn LeuAspLeu SerGluGluAsp SerAspAsp ProSerVal> 

760 770 780 790 800 

* * * * * * * * * * 

CCCTAGAGCT GTCCCAGCTC TCCATGCTGC CCCACCTGGC TGACCTGGTC 
GGGATCTCGA CAGGGTCGAG AGGTACGACG GGGTGGACCG ACTGGACCAG 
ThrLeuGluLeu SerGlnLeu SerMetLeu ProHisLeuAla AspLeuVal>

810 820 830 840 850

* * * * * * * * * *

AGTTACAGCA TCCAAAAGGT CATTGGCTTT GCTAAGATGA TACCAGGATT
TCAATGTCGT AGGTTTTCCA GTAACCGAAA CGATTCTACT ATGGTCCTAA
SerTyrSer IleGlnLysVal IleGlyPhe AlaLysMet IleProGlyPhe>

860 870 880 890 900

* * * * * * * * * *

CAGAGACCTC ACCTCTGAGG ACCAGATCGT ACTGCTGAAG TCAAGTGCCA
GTCTCTGGAG TGGAGACTCC TGGTCTAGCA TGACGACTTC AGTTCACGGT
ArgAspLeu ThrSerGlu AspGlnIleVal LeuLeuLys SerSerAla>

910 920 930 940 950

* * * * * * * * * *

TTGAGGTCAT CATGTTGCGC TCCAATGAGT CCTTCACCAT GGACGACATG
AACTCCAGTA GTACAACGCG AGGTTACTCA GGAAGTGGTA CCTGCTGTAC
IleGluValIle MetLeuArg SerAsnGlu SerPheThrMet AspAspMet>

960 970 980 990 1000 

* * * * * * * * * * 

TCCTGGACCT GTGGCAACCA AGACTACAAG TACCGCGTCA GTGACGTGAC
AGGACCTGGA CACCGTTGGT TCTGATGTTC ATGGCGCAGT CACTGCACTG
SerTrpThr CysGlyAsnGln AspTyrLys TyrArgVal SerAspValThr> 

1010 1020 1030 1040 1050

* * * * * * * * * *

CAAAGCCGGA CACAGCCTGG AGCTGATTGA GCCCCTCATC AAGTTCCAGG
GTTTCGGCCT GTGTCGGACC TCGACTAACT CGGGGAGTAG TTCAAGGTCC
LysAlaGly HisSerLeu GluLeuIleGlu ProLeuIle LysPheGln>

1060 1070 1080 1090 1100 

* * ** ** * * * * 

TGGGACTGAA GAAGCTGAAC TTGCATGAGG AGGAGCATGT CCTGCTCATG
ACCCTGACTT CTTCGACTTG AACGTACTCC TCCTCGTACA GGACGAGTAC
ValGlyLeuLys LysLeuAsn LeuHisGlu GluGluHisVal LeuLeuMet>

1110 1120 1130 1140 1150

* * * * * * * * * *

GCCATCTGCA TCGTCTCCCC AGATCGTCCT GGGGTGCAGG ACGCCGCGCT
CGGTAGACGT AGCAGAGGGG TCTAGCAGGA CCCCACGTCC TGCGGCGCGA
AlaIleCys IleValSer ProAspArgPro GlyValGln AspAlaAlaLeu>

1160 1170 1180 1190 1200

* * * * * * * * * *

GATTGAGGCC ATCCAGGACC GCCTGTCCAA CACACTGCAG ACGTACATCC
CTAACTCCGG TAGGTCCTGG CGGACAGGTT GTGTGACGTC TGCATGTAGG 
IleGluAla IleGlnAsp ArgLeuSerAsn ThrLeuGln ThrTyrIle> 

1210 1220 1230 1240 1250

* * * * * * * * * *

GCTGCCGCCA CCCGCCCCCG GGCAGCCACC TGCTCTATGC CAAGATGATC
CGACGGCGGT GGGCGGGGGC CCGTCGGTGG ACGAGATACG GTTCTACTAG
ArgCysArgHis ProProPro GlySerHis LeuLeuTyrAla LysMetIle>

1260 1270 1280 1290 1300

* * * * * * * * * *

CAGAAGCTAG CCGACCTGCG CAGCCTCAAT GAGGAGCACT CCAAGCAGTA
GTCTTCGATC GGCTGGACGC GTCGGAGTTA CTCCTCGTGA GGTTCGTCAT
GlnLysLeu AlaAspLeuArg SerLeuAsn GluGluHis SerLysGlnTyr>

1310 1320 1330 1340 1350

* * * * * * * * * * 

CCGCTGCCTC TCCTTCCAGC CTGAGTGCAG CATGAAGCTA ACGCCCCTTG 
GGCGACGGAG AGGAAGGTCG GACTCACGTC GTACTTCGAT TGCGGGGAAC 
ArgCysLeu SerPheGln ProGluCysSer MetLysLeu ThrProLeu> 

1360 1370 1380 

* * * * * * 

TGCTCGAAGT GTTTGGCAAT GAGATCTCCT GA 
ACGAGCTTCA CAAACCGTTA CTCTAGAGGA CT 
ValLeuGluVal PheGlyAsn GluIleSer ***>

FIGURE 7 TRANSCRIPT 10



(Sequence Range: 1 to 1534) 

10 20 30 40 50 

* * * * * * * * * * 

GTTTCCTTCT TCTGTCGGGG CGCCTTGGCA TGGAGTGGAG GAATAAGAAA 
CAAAGGAAGA AGACAGCCCC GCGGAACCGT ACCTCACCTC CTTATTCTTT 

MetGluTrpArg AsnLysLys> 

60 70 80 90 100 

* * * * * * * * * * 

AGGAGCGATT GGCTGTCGAT GGTGCTCAGA ACTGCTGGAG TGGAGGGGAT 
TCCTCGCTAA CCGACAGCTA CCACGAGTCT TGACGACCTC ACCTCCCCTA 
ArgSerAsp TrpLeuSerMet ValLeuArg ThrAlaGly ValGluGlyMet>

110 120 130 140 150 

* * + * * * * * * * 

GGAGGCAATG GCGGCCAGCA CTTCCCTGCC TGACCCTGGA GACTTTGACC 
CCTCCGTTAC CGCCGGTCGT GAAGGGACGG ACTGGGACCT CTGAAACTGG 
GluAlaMet AlaAlaSer ThrSerLeuPro AspProGly AspPheAsp> 

160 170 180 190 200

* * * * * * * * * *

GGAACGTGCC CCGGATCTGT GGGGTGTGTG GAGACCGAGC CACTGGCTTT
CCTTGCACGG GGCCTAGACA CCCCACACAC CTCTGGCTCG GTGACCGAAA
ArgAsnValPro ArgIleCys GlyValCys GlyAspArgAla ThrGlyPhe>

210 220 230 240 250 

* * * * * + * * ♦ * 

CACTTCAATG CTATGACCTG TGAAGGCTGC AAAGGCTTCT TCAGGTGAGC 
GTGAAGTTAC GATACTGGAC ACTTCCGACG TTTCCGAAGA AGTCCACTCG 
HisPheAsn AlaMetThrCys GluGlyCys LysGlyPhe PheArg*** 

260 270 280 290 300 

* * * * ** * * * * 

CCCCCTCCCA GGCTCTCCCC AGTGGAAAGG GAGGGAGAAG AAGCAAGGTG 
GGGGGAGGGT CCGAGAGGGG TCACCTTTCC CTCCCTCTTC TTCGTTCCAC 

310 320 330 340 350 

* * * * * * * * ** 

TTTCCATGAA GGGAGCCCTT GCATTTTTCA CATCTCCTTC CTTACAATGT 
AAAGGTACTT CCCTCGGGAA CGTAAAAAGT GTAGAGGAAG GAATGTTACA 

360 370 380 390 400 

* * * * * * * * ** 

CCATGGAACA TGCGGCGCTC ACAGCCACAG GAGCAGGAGG GTCTTGGCGA 
GGTACCTTGT ACGCCGCGAG TGTCGGTGTC CTCGTCCTCC CAGAACCGCT

410 420 430 440 450

* * * * * * * * * *

AGCATGAAGC GGAAGGCACT ATTCACCTGC CCCTTCAACG GGGACTGCCG 
TCGTACTTCG CCTTCCGTGA TAAGTGGACG GGGAAGTTGC CCCTGACGGC 

460 470 480 490 500

* * * * * * * * * *

CATCACCAAG GACAACCGAC GCCACTGCCA GGCCTGCCGG CTCAAACGCT 
GTAGTGGTTC CTGTTGGCTG CGGTGACGGT CCGGACGGCC GAGTTTGCGA 

510 520 530 540 550 

* * * * * * * * * * 

GTGTGGACAT CGGCATGATG AAGGAGTTCA TTCTGACAGA TGAGGAAGTG 
CACACCTGTA GCCGTACTAC TTCCTCAAGT AAGACTGTCT ACTCCTTCAC 

560 570 580 590 600 

* * * * * * * * * * 

CAGAGGAAGC GGGAGATGAT CCTGAAGCGG AAGGAGGAGG AGGCCTTGAA 
GTCTCCTTCG CCCTCTACTA GGACTTCGCC TTCCTCCTCC TCCGGAACTT

610 620 630 640 650 

» + * * * * * * ** 

GGACAGTCTG CGGCCCAAGC TGTCTGAGGA GCAGCAGCGC ATCATTGCCA 
CCTGTCAGAC GCCGGGTTCG ACAGACTCCT CGTCGTCGCG TAGTAACGGT 

660 670 680 690 700

* * * * * * * * * *

TACTGCTGGA CGCCCACCAT AAGACCTACG ACCCCACCTA CTCCGACTTC
ATGACGACCT GCGGGTGGTA TTCTGGATGC TGGGGTGGAT GAGGCTGAAG

710 720 730 740 750

* * * * * * * * * *

TGCCAGTTCC GGCCTCCAGT TCGTGTGAAT GATGGTGGAG GGAGCCATCC
ACGGTCAAGG CCGGAGGTCA AGCACACTTA CTACCACCTC CCTCGGTAGG

760 770 780 790 800

* * * * * * * * * *

TTCCAGGCCC AACTCCAGAC ACACTCCCAG CTTCTCTGGG GACTCCTCCT
AAGGTCCGGG TTGAGGTCTG TGTGAGGGTC GAAGAGACCC CTGAGGAGGA

810 820 830 840 850 

4 * * *• + + * * * 4 

CCTCCTGCTC AGATCACTGT ATCACCTCTT CAGACATGAT GGACTCGTCC 
GGAGGACGAG TCTAGTGACA TAGTGGAGAA GTCTGTACTA CCTGAGCAGG 

860 870 880 890 900

* * * * * * * * * *

AGCTTCTCCA ATCTGGATCT GAGTGAAGAA GATTCAGATG ACCCTTCTGT
TCGAAGAGGT TAGACCTAGA CTCACTTCTT CTAAGTCTAC TGGGAAGACA

910 920 930 940 950

* * * * * * * * * *

GACCCTAGAG CTGTCCCAGC TCTCCATGCT GCCCCACCTG GCTGACCTGG
CTGGGATCTC GACAGGGTCG AGAGGTACGA CGGGGTGGAC CGACTGGACC

960 970 980 990 1000

* * * * * * * * * *

TCAGTTACAG CATCCAAAAG GTCATTGGCT TTGCTAAGAT GATACCAGGA
AGTCAATGTC GTAGGTTTTC CAGTAACCGA AACGATTCTA CTATGGTCCT

1010 1020 1030 1040 1050 

* * * * * * * * * * 

TTCAGAGACC TCACCTCTGA GGACCAGATC GTACTGCTGA AGTCAAGTGC 
AAGTCTCTGG AGTGGAGACT CCTGGTCTAG CATGACGACT TCAGTTCACG 

1060 1070 1080 1090 1100 

* * * * * * * * * * 

CATTGAGGTC ATCATGTTGC GCTCCAATGA GTCCTTCACC ATGGACGACA 
GTAACTCCAG TAGTACAACG CGAGGTTACT CAGGAAGTGG TACCTGCTGT 

1110 1120 1130 1140 1150

* * * * * * * * * *

TGTCCTGGAC CTGTGGCAAC CAAGACTACA AGTACCGCGT CAGTGACGTG
ACAGGACCTG GACACCGTTG GTTCTGATGT TCATGGCGCA GTCACTGCAC 

1160 1170 1180 1190 1200

* * * * * * * * * *

ACCAAAGCCG GACACAGCCT GGAGCTGATT GAGCCCCTCA TCAAGTTCCA 
TGGTTTCGGC CTGTGTCGGA CCTCGACTAA CTCGGGGAGT AGTTCAAGGT 

1210 1220 1230 1240 1250

* * * * ** * * * * 

GGTGGGACTG AAGAAGCTGA ACTTGCATGA GGAGGAGCAT GTCCTGCTCA 
CCACCCTGAC TTCTTCGACT TGAACGTACT CCTCCTCGTA CAGGACGAGT 

1260 1270 1280 1290 1300 

* * **- * * * * * + 

TGGCCATCTG CATCGTCTCC CCAGATCGTC CTGGGGTGCA GGACGCCGCG 
ACCGGTAGAC GTAGCAGAGG GGTCTAGCAG GACCCCACGT CCTGCGGCGC 

1310 1320 1330 1340 1350 

* ★ * * * * * + * * 

CTGATTGAGG CCATCCAGGA CCGCCTGTCC AACACACTGC AGACGTACAT 
GACTAACTCC GGTAGGTCCT GGCGGACAGG TTGTGTGACG TCTGCATGTA 

1360 1370 1380 1390 1400 

* * * * * * * * * * 

CCGCTGCCGC CACCCGCCCC CGGGCAGCCA CCTGCTCTAT GCCAAGATGA 
GGCGACGGCG GTGGGCGGGG GCCCGTCGGT GGACGAGATA CGGTTCTACT

1410 1420 1430 1440 1450 

** * * * * * * * * 

TCCAGAAGCT AGCCGACCTG CGCAGCCTCA ATGAGGAGCA CTCCAAGCAG
AGGTCTTCGA TCGGCTGGAC GCGTCGGAGT TACTCCTCGT GAGGTTCGTC

1460 1470 1480 1490 1500

* * * * * * * * * *

TACCGCTGCC TCTCCTTCCA GCCTGAGTGC AGCATGAAGC TAACGCCCCT
ATGGCGACGG AGAGGAAGGT CGGACTCACG TCGTACTTCG ATTGCGGGGA

1510 1520 1530

* * * * * * 

TGTGCTCGAA GTGTTTGGCA ATGAGATCTC CTGA 
ACACGAGCTT CACAAACCGT TACTCTAGAG GACT

FIGURE 8 TRANSCRIPT 11

10 20 30 40 50

* * * * *

TGCGACCTTG GCGGTGAGCC TGGGGACAGG GGTGAGGCCA GAGACGGACG
ACGCTGGAAC CGCCACTCGG ACCCCTGTCC CCACTCCGGT CTCTGCCTGC

60 70 80 90 100

* * * * *

GACGCAGGGG CCCGGCCCAA GGCGAGGGAG AACAGCGGCA CTAAGGCAGA
CTGCGTCCCC GGGCCGGGTT CCGCTCCCTC TTGTCGCCGT GATTCCGTCT

110 120 130 140 150

AAGGAAGAGG GCGGTGTGTT CACCCGCAGC CCAATCCATC ACTCAGCAAC
TTCCTTCTCC CGCCACACAA GTGGGCGTCG GGTTAGGTAG TGAGTCGTTG

160 170 180 190 200

* * * * *

TCCTAGACGC TGGTAGAAAG TTCCTCCGAG GAGCCTGCCA TCCAGTCGTG
AGGATCTGCG ACCATCTTTC AAGGAGGCTC CTCGGACGGT AGGTCAGCAC

210 220 230 240 250

* * * * *

CGTGCAGAAG CCTTTGGGTC TGAAGTGTCT GTGAGACCTC ACAGAAGAGC
GCACGTCTTC GGAAACCCAG ACTTCACAGA CACTCTGGAG TGTCTTCTCG

260 270 280 290 300

* * * * *

ACCCCTGGGC TCCACTTACC TGCCCCCTGC TCCTTCAGGG ATGGAGGCAA
TGGGGACCCG AGGTGAATGG ACGGGGGACG AGGAAGTCCC TACCTCCGTT
MetGluAla>

310 320 330 340 350

* * * * *

TGGCGGCCAG CACTTCCCTG CCTGACCCTG GAGACTTTGA CCGGAACGTG
ACCGCCGGTC GTGAAGGGAC GGACTGGGAC CTCTGAAACT GGCCTTGCAC
MetAlaAlaSer ThrSerLeu ProAspPro GlyAspPheAsp ArgAsnVal>

360 370 380 390 400

* * * * *

CCCCGGATCT GTGGGGTGTG TGGAGACCGA GCCACTGGCT TTCACTTCAA
GGGGCCTAGA CACCCCACAC ACCTCTGGCT CGGTGACCGA AAGTGAAGTT
ProArgIle CysGlyValCys GlyAspArg AlaThrGly PheHisPheAsn>

410 420 430 440 450

* * * * *

TGCTATGACC TGTGAAGGCT GCAAAGGCTT CTTCAGGCGA AGCATGAAGC
ACGATACTGG ACACTTCCGA CGTTTCCGAA GAAGTCCGCT TCGTACTTCG
AlaMetThr CysGluGly CysLysGlyPhe PheArgArg SerMetLys>

460 470 480 490 500

* * * * *

GGAAGGCACT ATTCACCTGC CCCTTCAACG GGGACTGCCG CATCACCAAG
CCTTCCGTGA TAAGTGGACG GGGAAGTTGC CCCTGACGGC GTAGTGGTTC
ArgLysAlaLeu PheThrCys ProPheAsn GlyAspCysArg IleThrLys>

510 520 530 540 550

* * * * *

GACAACCGAC GCCACTGCCA GGCCTGCCGG CTCAAACGCT GTGTGGACAT
CTGTTGGCTG CGGTGACGGT CCGGACGGCC GAGTTTGCGA CACACCTGTA
AspAsnArg ArgHisCysGln AlaCysArg LeuLysArg CysValAspIle>

560 570 580 590 600

* * * * *

CGGCATGATG AAGGAGTTCA TTCTGACAGA TGAGGAAGTG CAGAGGAAGC
GCCGTACTAC TTCCTCAAGT AAGACTGTCT ACTCCTTCAC GTCTCCTTCG
GlyMetMet LysGluPhe IleLeuThrAsp GluGluVal GlnArgLys>

610 620 630 640 650

GGGAGATGAT CCTGAAGCGG AAGGAGGAGG AGGCCTTGAA GGACAGTCTG
CCCTCTACTA GGACTTCGCC TTCCTCCTCC TCCGGAACTT CCTGTCAGAC
ArgGluMetIle LeuLysArg LysGluGlu GluAlaLeuLys AspSerLeu>

660 670 680 690 700

* * * * *

CGGCCCAAGC TGTCTGAGGA GCAGCAGCGC ATCATTGCCA TACTGCTGGA
GCCGGGTTCG ACAGACTCCT CGTCGTCGCG TAGTAACGGT ATGACGACCT
ArgProLys LeuSerGluGlu GlnGlnArg IleIleAlaIle LeuLeuAsp>

710 720 730 740 750

* * * * *

CGCCCACCAT AAGACCTACG ACCCCACCTA CTCCGACTTC TGCCAGTTCC
GCGGGTGGTA TTCTGGATGC TGGGGTGGAT GAGGCTGAAG ACGGTCAAGG
AlaHisHis LysThrTyr AspProThrTyr SerAspPhe CysGlnPhe>

760 770 780 790 800

* * * * *

GGCCTCCAGT TCGTGTGAAT GATGGTGGAG GGAGCCATCC TTCCAGGCCC
CCGGAGGTCA AGCACACTTA CTACCACCTC CCTCGGTAGG AAGGTCCGGG
ArgProProVal ArgValAsn AspGlyGly GlySerHisPro SerArgPro>

810 820 830 840 850

* * * * *

AACTCCAGAC ACACTCCCAG CTTCTCTGGG GACTCCTCCT CCTCCTGCTC
TTGAGGTCTG TGTGAGGGTC GAAGAGACCC CTGAGGAGGA GGAGGACGAG
AsnSerArg HisThrProSer PheSerGly AspSerSer SerSerCysSer>

860 870 880 890 900

* * * * *

AGATCACTGT ATCACCTCTT CAGACATGAT GGACTCGTCC AGCTTCTCCA
TCTAGTGACA TAGTGGAGAA GTCTGTACTA CCTGAGCAGG TCGAAGAGGT
AspHisCys IleThrSer SerAspMetMet AspSerSer SerPheSer>

910 920 930 940 950

* * * * *

ATCTGGATCT GAGTGAAGAA GATTCAGATG ACCCTTCTGT GACCCTAGAG
TAGACCTAGA CTCACTTCTT CTAAGTCTAC TGGGAAGACA CTGGGATCTC
AsnLeuAspLeu SerGluGlu AspSerAsp AspProSerVal ThrLeuGlu>

960 970 980 990 1000

* * * * *

CTGTCCCAGC TCTCCATGCT GCCCCACCTG GCTGACCTGG TCAGTTACAG
GACAGGGTCG AGAGGTACGA CGGGGTGGAC CGACTGGACC AGTCAATGTC
LeuSerGln LeuSerMet LeuProHisLeu AlaAspLeu ValSerTyrSer>

1010 1020 1030 1040 1050

* * * * *

CATCCAAAAG GTCATTGGCT TTGCTAAGAT GATACCAGGA TTCAGAGACC
GTAGGTTTTC CAGTAACCGA AACGATTCTA CTATGGTCCT AAGTCTCTGG
IleGlnLys ValIleGly PheAlaLysMet IleProGly PheArgAsp>

1060 1070 1080 1090 1100

* * * * *

TCACCTCTGA GGACCAGATC GTACTGCTGA AGTCAAGTGC CATTGAGGTC
AGTGGAGACT CCTGGTCTAG CATGACGACT TCAGTTCACG GTAACTCCAG
LeuThrSerGlu AspGlnIle ValLeuLeu LysSerSerAla IleGluVal>

1110 1120 1130 1140 1150

* * * * *

ATCATGTTGC GCTCCAATGA GTCCTTCACC ATGGACGACA TGTCCTGGAC
TAGTACAACG CGAGGTTACT CAGGAAGTGG TACCTGCTGT ACAGGACCTG
IleMetLeu ArgSerAsnGlu SerPheThr MetAspAsp MetSerTrpThr>

1160 1170 1180 1190 1200

* * * * *

CTGTGGCAAC CAAGACTACA AGTACCGCGT CAGTGACGTG ACCAAAGCCG
GACACCGTTG GTTCTGATGT TCATGGCGCA GTCACTGCAC TGGTTTCGGC
CysGlyAsn GlnAspTyr LysTyrArgVal SerAspVal ThrLysAla>

1210 1220 1230 1240 1250

* * * * *

GACACAGCCT GGAGCTGATT GAGCCCCTCA TCAAGTTCCA GGTGGGACTG
CTGTGTCGGA CCTCGACTAA CTCGGGGAGT AGTTCAAGGT CCACCCTGAC
GlyHisSerLeu GluLeuIle GluProLeu IleLysPheGln ValGlyLeu>

1260 1270 1280 1290 1300

* * * * *

AAGAAGCTGA ACTTGCATGA GGAGGAGCAT GTCCTGCTCA TGGCCATCTG
TTCTTCGACT TGAACGTACT CCTCCTCGTA CAGGACGAGT ACCGGTAGAC
LysLysLeu AsnLeuHisGlu GluGluHis ValLeuLeu MetAlaIleCys>

1310 1320 1330 1340 1350

* * * * *

CATCGTCTCC CCAGATCGTC CTGGGGTGCA GGACGCCGCG CTGATTGAGG
GTAGCAGAGG GGTCTAGCAG GACCCCACGT CCTGCGGCGC GACTAACTCC
IleValSer ProAspArg ProGlyValGln AspAlaAla LeuIleGlu>

1360 1370 1380 1390 1400

* * * * *

CCATCCAGGA CCGCCTGTCC AACACACTGC AGACGTACAT CCGCTGCCGC
GGTAGGTCCT GGCGGACAGG TTGTGTGACG TCTGCATGTA GGCGACGGCG
AlaIleGlnAsp ArgLeuSer AsnThrLeu GlnThrTyrIle ArgCysArg>

1410 1420 1430 1440 1450

* * * * *

CACCCGCCCC CGGGCAGCCA CCTGCTCTAT GCCAAGATGA TCCAGAAGCT
GTGGGCGGGG GCCCGTCGGT GGACGAGATA CGGTTCTACT AGGTCTTCGA
HisProPro ProGlySerHis LeuLeuTyr AlaLysMet IleGlnLysLeu>

1460 1470 1480 1490 1500

* * * * *

AGCCGACCTG CGCAGCCTCA ATGAGGAGCA CTCCAAGCAG TACCGCTGCC
TCGGCTGGAC GCGTCGGAGT TACTCCTCGT GAGGTTCGTC ATGGCGACGG
AlaAspLeu ArgSerLeu AsnGluGluHis SerLysGln TyrArgCys>

1510 1520 1530 1540 1550

* * * * *

TCTCCTTCCA GCCTGAGTGC AGCATGAAGC TAACGCCCCT TGTGCTCGAA
AGAGGAAGGT CGGACTCACG TCGTACTTCG ATTGCGGGGA ACACGAGCTT
LeuSerPheGln ProGluCys SerMetLys LeuThrProLeu ValLeuGlu>

1560 1570 

GTGTTTGGCA ATGAGATCTC CTGA 
CACAAACCGT TACTCTAGAG GACT
ValPheGly AsnGluIleSer ***>